834 resultados para Naïve Bayes classifier
Resumo:
Nearest neighbor search is commonly employed in face recognition but it does not scale well to large dataset sizes. A strategy to combine rejection classifiers into a cascade for face identification is proposed in this paper. A rejection classifier for a pair of classes is defined to reject at least one of the classes with high confidence. These rejection classifiers are able to share discriminants in feature space and at the same time have high confidence in the rejection decision. In the face identification problem, it is possible that a pair of known individual faces are very dissimilar. It is very unlikely that both of them are close to an unknown face in the feature space. Hence, only one of them needs to be considered. Using a cascade structure of rejection classifiers, the scope of nearest neighbor search can be reduced significantly. Experiments on Face Recognition Grand Challenge (FRGC) version 1 data demonstrate that the proposed method achieves significant speed up and an accuracy comparable with the brute force Nearest Neighbor method. In addition, a graph cut based clustering technique is employed to demonstrate that the pairwise separability of these rejection classifiers is capable of semantic grouping.
Resumo:
We present a highly accurate method for classifying web pages based on link percentage, which is the percentage of text characters that are parts of links normalized by the number of all text characters on a web page. K-means clustering is used to create unique thresholds to differentiate index pages and article pages on individual web sites. Index pages contain mostly links to articles and other indices, while article pages contain mostly text. We also present a novel link grouping algorithm using agglomerative hierarchical clustering that groups links in the same spatial neighborhood together while preserving link structure. Grouping allows users with severe disabilities to use a scan-based mechanism to tab through a web page and select items. In experiments, we saw up to a 40-fold reduction in the number of commands needed to click on a link with a scan-based interface, which shows that we can vastly improve the rate of communication for users with disabilities. We used web page classification and link grouping to alter web page display on an accessible web browser that we developed to make a usable browsing interface for users with disabilities. Our classification method consistently outperformed a baseline classifier even when using minimal data to generate article and index clusters, and achieved classification accuracy of 94.0% on web sites with well-formed or slightly malformed HTML, compared with 80.1% accuracy for the baseline classifier.
Resumo:
A procedure that uses fuzzy ARTMAP and K-Nearest Neighbor (K-NN) categorizers to evaluate intrinsic and extrinsic speaker normalization methods is described. Each classifier is trained on preprocessed, or normalized, vowel tokens from about 30% of the speakers of the Peterson-Barney database, then tested on data from the remaining speakers. Intrinsic normalization methods included one nonscaled, four psychophysical scales (bark, bark with end-correction, mel, ERB), and three log scales, each tested on four different combinations of the fundamental (Fo) and the formants (F1 , F2, F3). For each scale and frequency combination, four extrinsic speaker adaptation schemes were tested: centroid subtraction across all frequencies (CS), centroid subtraction for each frequency (CSi), linear scale (LS), and linear transformation (LT). A total of 32 intrinsic and 128 extrinsic methods were thus compared. Fuzzy ARTMAP and K-NN showed similar trends, with K-NN performing somewhat better and fuzzy ARTMAP requiring about 1/10 as much memory. The optimal intrinsic normalization method was bark scale, or bark with end-correction, using the differences between all frequencies (Diff All). The order of performance for the extrinsic methods was LT, CSi, LS, and CS, with fuzzy AHTMAP performing best using bark scale with Diff All; and K-NN choosing psychophysical measures for all except CSi.
Resumo:
Training data for supervised learning neural networks can be clustered such that the input/output pairs in each cluster are redundant. Redundant training data can adversely affect training time. In this paper we apply two clustering algorithms, ART2 -A and the Generalized Equality Classifier, to identify training data clusters and thus reduce the training data and training time. The approach is demonstrated for a high dimensional nonlinear continuous time mapping. The demonstration shows six-fold decrease in training time at little or no loss of accuracy in the handling of evaluation data.
Resumo:
The recognition of 3-D objects from sequences of their 2-D views is modeled by a family of self-organizing neural architectures, called VIEWNET, that use View Information Encoded With NETworks. VIEWNET incorporates a preprocessor that generates a compressed but 2-D invariant representation of an image, a supervised incremental learning system that classifies the preprocessed representations into 2-D view categories whose outputs arc combined into 3-D invariant object categories, and a working memory that makes a 3-D object prediction by accumulating evidence from 3-D object category nodes as multiple 2-D views are experienced. The simplest VIEWNET achieves high recognition scores without the need to explicitly code the temporal order of 2-D views in working memory. Working memories are also discussed that save memory resources by implicitly coding temporal order in terms of the relative activity of 2-D view category nodes, rather than as explicit 2-D view transitions. Variants of the VIEWNET architecture may also be used for scene understanding by using a preprocessor and classifier that can determine both What objects are in a scene and Where they are located. The present VIEWNET preprocessor includes the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and suppresses image noise. This boundary segmentation is rendered invariant under 2-D translation, rotation, and dilation by use of a log-polar transform. The invariant spectra undergo Gaussian coarse coding to further reduce noise and 3-D foreshortening effects, and to increase generalization. These compressed codes are input into the classifier, a supervised learning system based on the fuzzy ARTMAP algorithm. Fuzzy ARTMAP learns 2-D view categories that are invariant under 2-D image translation, rotation, and dilation as well as 3-D image transformations that do not cause a predictive error. Evidence from sequence of 2-D view categories converges at 3-D object nodes that generate a response invariant under changes of 2-D view. These 3-D object nodes input to a working memory that accumulates evidence over time to improve object recognition. ln the simplest working memory, each occurrence (nonoccurrence) of a 2-D view category increases (decreases) the corresponding node's activity in working memory. The maximally active node is used to predict the 3-D object. Recognition is studied with noisy and clean image using slow and fast learning. Slow learning at the fuzzy ARTMAP map field is adapted to learn the conditional probability of the 3-D object given the selected 2-D view category. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of l28x128 2-D views of aircraft with and without additive noise. A recognition rate of up to 90% is achieved with one 2-D view and of up to 98.5% correct with three 2-D views. The properties of 2-D view and 3-D object category nodes are compared with those of cells in monkey inferotemporal cortex.
Resumo:
Adaptive Resonance Theory (ART) models are real-time neural networks for category learning, pattern recognition, and prediction. Unsupervised fuzzy ART and supervised fuzzy ARTMAP synthesize fuzzy logic and ART networks by exploiting the formal similarity between the computations of fuzzy subsethood and the dynamics of ART category choice, search, and learning. Fuzzy ART self-organizes stable recognition categories in response to arbitrary sequences of analog or binary input patterns. It generalizes the binary ART 1 model, replacing the set-theoretic: intersection (∩) with the fuzzy intersection (∧), or component-wise minimum. A normalization procedure called complement coding leads to a symmetric: theory in which the fuzzy inter:>ec:tion and the fuzzy union (∨), or component-wise maximum, play complementary roles. Complement coding preserves individual feature amplitudes while normalizing the input vector, and prevents a potential category proliferation problem. Adaptive weights :otart equal to one and can only decrease in time. A geometric interpretation of fuzzy AHT represents each category as a box that increases in size as weights decrease. A matching criterion controls search, determining how close an input and a learned representation must be for a category to accept the input as a new exemplar. A vigilance parameter (p) sets the matching criterion and determines how finely or coarsely an ART system will partition inputs. High vigilance creates fine categories, represented by small boxes. Learning stops when boxes cover the input space. With fast learning, fixed vigilance, and an arbitrary input set, learning stabilizes after just one presentation of each input. A fast-commit slow-recode option allows rapid learning of rare events yet buffers memories against recoding by noisy inputs. Fuzzy ARTMAP unites two fuzzy ART networks to solve supervised learning and prediction problems. A Minimax Learning Rule controls ARTMAP category structure, conjointly minimizing predictive error and maximizing code compression. Low vigilance maximizes compression but may therefore cause very different inputs to make the same prediction. When this coarse grouping strategy causes a predictive error, an internal match tracking control process increases vigilance just enough to correct the error. ARTMAP automatically constructs a minimal number of recognition categories, or "hidden units," to meet accuracy criteria. An ARTMAP voting strategy improves prediction by training the system several times using different orderings of the input set. Voting assigns confidence estimates to competing predictions given small, noisy, or incomplete training sets. ARPA benchmark simulations illustrate fuzzy ARTMAP dynamics. The chapter also compares fuzzy ARTMAP to Salzberg's Nested Generalized Exemplar (NGE) and to Simpson's Fuzzy Min-Max Classifier (FMMC); and concludes with a summary of ART and ARTMAP applications.
Resumo:
A neural model is proposed of how laminar interactions in the visual cortex may learn and recognize object texture and form boundaries. The model brings together five interacting processes: region-based texture classification, contour-based boundary grouping, surface filling-in, spatial attention, and object attention. The model shows how form boundaries can determine regions in which surface filling-in occurs; how surface filling-in interacts with spatial attention to generate a form-fitting distribution of spatial attention, or attentional shroud; how the strongest shroud can inhibit weaker shrouds; and how the winning shroud regulates learning of texture categories, and thus the allocation of object attention. The model can discriminate abutted textures with blurred boundaries and is sensitive to texture boundary attributes like discontinuities in orientation and texture flow curvature as well as to relative orientations of texture elements. The model quantitatively fits a large set of human psychophysical data on orientation-based textures. Object boundar output of the model is compared to computer vision algorithms using a set of human segmented photographic images. The model classifies textures and suppresses noise using a multiple scale oriented filterbank and a distributed Adaptive Resonance Theory (dART) classifier. The matched signal between the bottom-up texture inputs and top-down learned texture categories is utilized by oriented competitive and cooperative grouping processes to generate texture boundaries that control surface filling-in and spatial attention. Topdown modulatory attentional feedback from boundary and surface representations to early filtering stages results in enhanced texture boundaries and more efficient learning of texture within attended surface regions. Surface-based attention also provides a self-supervising training signal for learning new textures. Importance of the surface-based attentional feedback in texture learning and classification is tested using a set of textured images from the Brodatz micro-texture album. Benchmark studies vary from 95.1% to 98.6% with attention, and from 90.6% to 93.2% without attention.
Resumo:
In this work, we investigate tennis stroke recognition using a single inertial measuring unit attached to a player’s forearm during a competitive match. This paper evaluates the best approach for stroke detection using either accelerometers, gyroscopes or magnetometers, which are embedded into the inertial measuring unit. This work concludes what is the optimal training data set for stroke classification and proves that classifiers can perform well when tested on players who were not used to train the classifier. This work provides a significant step forward for our overall goal, which is to develop next generation sports coaching tools using both inertial and visual sensors in an instrumented indoor sporting environment.
Resumo:
The parasite Bonamia ostreae has decimated Ostrea edulis stocks throughout Europe. The complete life cycle and means of transmission of the parasite remains unknown. The methods used to diagnose B. ostreae were examined to determine sensitivity and reproducibility. Two methods, with fixed protocols, should be used for the accurate detection of infection within a sample. A 13-month study of two stocks of O. edulis with varying periods of exposure to B. ostreae, was undertaken to determine if varying lengths of exposure would translate into observations of differing susceptibility. Oyster stocks can maintain themselves over extended periods of time in B. ostreae endemic areas. To identify a well performing spat stock, which could be used to repopulate beds within the region, hatchery bred spat from three stocks found in the North sea were placed on a B. ostreae infected bed and screened for growth, mortality and prevalence of infection. Local environmental factors may influence oyster performance, with local stocks better adapted to these conditions. Sediment and macroinvertebrate species were screened to investigate mechanisms by which B. ostreae may be maintaining itself on oyster beds. Mytilus edulis was positive, indicating that B. ostreae may use incidental carriers as a method of maintaining itself. The ability of oyster larvae to pick up infection from the surrounding environment was investigated by collecting larvae from brooding oysters from different areas. Larvae may acquire the pathogen from the water column during the process of filter feeding by the brooding adult, even when the parents themselves are uninfected. A study was undertaken to elucidate the activity of the parasite during the initial stage of infection, when it cannot be detected within the host. A naïve stock screened negative for infection throughout the trial, using heart imprints and PCR yet B. ostreae was detected by in-situ hybridisation.
Resumo:
The electroencephalogram (EEG) is an important noninvasive tool used in the neonatal intensive care unit (NICU) for the neurologic evaluation of the sick newborn infant. It provides an excellent assessment of at-risk newborns and formulates a prognosis for long-term neurologic outcome.The automated analysis of neonatal EEG data in the NICU can provide valuable information to the clinician facilitating medical intervention. The aim of this thesis is to develop a system for automatic classification of neonatal EEG which can be mainly divided into two parts: (1) classification of neonatal EEG seizure from nonseizure, and (2) classifying neonatal background EEG into several grades based on the severity of the injury using atomic decomposition. Atomic decomposition techniques use redundant time-frequency dictionaries for sparse signal representations or approximations. The first novel contribution of this thesis is the development of a novel time-frequency dictionary coherent with the neonatal EEG seizure states. This dictionary was able to track the time-varying nature of the EEG signal. It was shown that by using atomic decomposition and the proposed novel dictionary, the neonatal EEG transition from nonseizure to seizure states could be detected efficiently. The second novel contribution of this thesis is the development of a neonatal seizure detection algorithm using several time-frequency features from the proposed novel dictionary. It was shown that the time-frequency features obtained from the atoms in the novel dictionary improved the seizure detection accuracy when compared to that obtained from the raw EEG signal. With the assistance of a supervised multiclass SVM classifier and several timefrequency features, several methods to automatically grade EEG were explored. In summary, the novel techniques proposed in this thesis contribute to the application of advanced signal processing techniques for automatic assessment of neonatal EEG recordings.
Resumo:
As more diagnostic testing options become available to physicians, it becomes more difficult to combine various types of medical information together in order to optimize the overall diagnosis. To improve diagnostic performance, here we introduce an approach to optimize a decision-fusion technique to combine heterogeneous information, such as from different modalities, feature categories, or institutions. For classifier comparison we used two performance metrics: The receiving operator characteristic (ROC) area under the curve [area under the ROC curve (AUC)] and the normalized partial area under the curve (pAUC). This study used four classifiers: Linear discriminant analysis (LDA), artificial neural network (ANN), and two variants of our decision-fusion technique, AUC-optimized (DF-A) and pAUC-optimized (DF-P) decision fusion. We applied each of these classifiers with 100-fold cross-validation to two heterogeneous breast cancer data sets: One of mass lesion features and a much more challenging one of microcalcification lesion features. For the calcification data set, DF-A outperformed the other classifiers in terms of AUC (p < 0.02) and achieved AUC=0.85 +/- 0.01. The DF-P surpassed the other classifiers in terms of pAUC (p < 0.01) and reached pAUC=0.38 +/- 0.02. For the mass data set, DF-A outperformed both the ANN and the LDA (p < 0.04) and achieved AUC=0.94 +/- 0.01. Although for this data set there were no statistically significant differences among the classifiers' pAUC values (pAUC=0.57 +/- 0.07 to 0.67 +/- 0.05, p > 0.10), the DF-P did significantly improve specificity versus the LDA at both 98% and 100% sensitivity (p < 0.04). In conclusion, decision fusion directly optimized clinically significant performance measures, such as AUC and pAUC, and sometimes outperformed two well-known machine-learning techniques when applied to two different breast cancer data sets.
Resumo:
The array of human immunodeficiency virus (HIV) subtypes encountered in East London, an area long associated with migration, is unusually heterogeneous, reflecting the diverse geographical origins of the population. In this study it was shown that viral subtypes or clades infecting a sample of HIV type 1 (HIV-1)-positive individuals in East London reflect the global pandemic. The authors studied the humoral response in 210 treatment-naïve chronically HIV-1-infected (>1 year) adult subjects against a panel of 12 viruses from six different clades. Plasmas from individuals infected with clade C, but also plasmas from clade A, and to a lesser degree clade CRF02_AG and CRF01_AE, were significantly more potent at neutralizing the tested viruses compared with plasmas from individuals infected with clade B. The difference in humoral robustness between clade C- and B-infected patients was confirmed in titration studies with an extended panel of clade B and C viruses. These results support the approach to develop an HIV-1 vaccine that includes clade C or A envelope protein (Env) immunogens for the induction of a potent neutralizing humoral response.
Resumo:
We develop a model for stochastic processes with random marginal distributions. Our model relies on a stick-breaking construction for the marginal distribution of the process, and introduces dependence across locations by using a latent Gaussian copula model as the mechanism for selecting the atoms. The resulting latent stick-breaking process (LaSBP) induces a random partition of the index space, with points closer in space having a higher probability of being in the same cluster. We develop an efficient and straightforward Markov chain Monte Carlo (MCMC) algorithm for computation and discuss applications in financial econometrics and ecology. This article has supplementary material online.
Resumo:
BACKGROUND: Speciation begins when populations become genetically separated through a substantial reduction in gene flow, and it is at this point that a genetically cohesive set of populations attain the sole property of species: the independent evolution of a population-level lineage. The comprehensive delimitation of species within biodiversity hotspots, regardless of their level of divergence, is important for understanding the factors that drive the diversification of biota and for identifying them as targets for conservation. However, delimiting recently diverged species is challenging due to insufficient time for the differential evolution of characters--including morphological differences, reproductive isolation, and gene tree monophyly--that are typically used as evidence for separately evolving lineages. METHODOLOGY: In this study, we assembled multiple lines of evidence from the analysis of mtDNA and nDNA sequence data for the delimitation of a high diversity of cryptically diverged population-level mouse lemur lineages across the island of Madagascar. Our study uses a multi-faceted approach that applies phylogenetic, population genetic, and genealogical analysis for recognizing lineage diversity and presents the most thoroughly sampled species delimitation of mouse lemur ever performed. CONCLUSIONS: The resolution of a large number of geographically defined clades in the mtDNA gene tree provides strong initial evidence for recognizing a high diversity of population-level lineages in mouse lemurs. We find additional support for lineage recognition in the striking concordance between mtDNA clades and patterns of nuclear population structure. Lineages identified using these two sources of evidence also exhibit patterns of population divergence according to genealogical exclusivity estimates. Mouse lemur lineage diversity is reflected in both a geographically fine-scaled pattern of population divergence within established and geographically widespread taxa, as well as newly resolved patterns of micro-endemism revealed through expanded field sampling into previously poorly and well-sampled regions.
Resumo:
Today, the only surviving wild population of giant tortoises in the Indian Ocean occurs on the island of Aldabra. However, giant tortoises once inhabited islands throughout the western Indian Ocean. Madagascar, Africa, and India have all been suggested as possible sources of colonization for these islands. To address the origin of Indian Ocean tortoises (Dipsochelys, formerly Geochelone gigantea), we sequenced the 12S, 16S, and cyt b genes of the mitochondrial DNA. Our phylogenetic analysis shows Dipsochelys to be embedded within the Malagasy lineage, providing evidence that Indian Ocean giant tortoises are derived from a common Malagasy ancestor. This result points to Madagascar as the source of colonization for western Indian Ocean islands by giant tortoises. Tortoises are known to survive long oceanic voyages by floating with ocean currents, and thus, currents flowing northward towards the Aldabra archipelago from the east coast of Madagascar would have provided means for the colonization of western Indian Ocean islands. Additionally, we found an accelerated rate of sequence evolution in the two Malagasy Pyxis species examined. This finding supports previous theories that shorter generation time and smaller body size are related to an increase in mitochondrial DNA substitution rate in vertebrates.