Biblioteca Digital

81 results for Optical pattern recognition.

Entropy based Skew correction of document images

Relevance: 90.00%

Abstract:

The document images that are fed into an Optical Character Recognition system might be skewed. This could be due to improper feeding of the document into the scanner or to a faulty scanner. In this paper, we propose a skew detection and correction method for document images. We make use of the inherent randomness in the horizontal projection profiles of a text block image, as the skew of the image varies. The proposed algorithm has proved to be very robust and time efficient. The entire process takes less than a second on a 2.4 GHz Pentium IV PC.
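The idea of measuring the randomness of the horizontal projection profile can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes the skew is the candidate angle that minimises the Shannon entropy of the profile, represents the image as a list of foreground pixel coordinates, and uses hypothetical helper names (`profile_entropy`, `estimate_skew`, `make_skewed_lines`).

```python
import math
from collections import Counter


def profile_entropy(points, angle_deg):
    """Entropy (bits) of the horizontal projection profile after rotating by -angle_deg."""
    theta = math.radians(angle_deg)
    s, c = math.sin(theta), math.cos(theta)
    # Row index (nearest integer) of each foreground pixel once the candidate skew is undone.
    rows = Counter(math.floor(-x * s + y * c + 0.5) for x, y in points)
    total = sum(rows.values())
    return -sum((n / total) * math.log2(n / total) for n in rows.values())


def estimate_skew(points, candidate_angles):
    """Return the candidate angle whose projection profile has the lowest entropy (assumed criterion)."""
    return min(candidate_angles, key=lambda a: profile_entropy(points, a))


def make_skewed_lines(skew_deg, n_lines=5, line_gap=10, width=200):
    """Synthetic 'text block': horizontal rows of foreground pixels, rotated by skew_deg."""
    theta = math.radians(skew_deg)
    s, c = math.sin(theta), math.cos(theta)
    return [
        (x * c - y * s, x * s + y * c)
        for y in range(0, n_lines * line_gap, line_gap)
        for x in range(width)
    ]


points = make_skewed_lines(5)
print(estimate_skew(points, range(-10, 11)))
```

When the candidate angle matches the true skew, each text line collapses into a single profile row and the entropy is at its lowest; at other angles the pixels smear across many rows and the entropy rises.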

Artificial database for character recognition research

Relevance: 90.00%

Abstract:

This paper describes a technique for artificial generation of learning and test sample sets suitable for character recognition research. Sample sets of English (Latin), Malayalam, Kannada and Tamil characters are generated easily through their prototype specifications by the endpoint co-ordinates, nature of segments and connectivity.

A heuristic clustering algorithm using union of overlapping pattern-cells

Relevance: 90.00%

Abstract:

Relative geometric arrangements of the sample points, with reference to the structure of the imbedding space, produce clusters. Hence, if each sample point is imagined to acquire a volume of a small M-cube (called pattern-cell), depending on the ranges of its (M) features and number (N) of samples; then overlapping pattern-cells would indicate naturally closer sample-points. A chain or blob of such overlapping cells would mean a cluster and separate clusters would not share a common pattern-cell between them. The conditions and an analytic method to find such an overlap are developed. A simple, intuitive, nonparametric clustering procedure, based on such overlapping pattern-cells is presented. It may be classified as an agglomerative, hierarchical, linkage-type clustering procedure. The algorithm is fast, requires low storage and can identify irregular clusters. Two extensions of the algorithm, to separate overlapping clusters and to estimate the nature of pattern distributions in the sample space, are also indicated.

A knowledge-based approach to pattern generation

Relevance: 90.00%

Abstract:

Pattern Cognition is looked at from the functional view point. The need for knowledge in synthesizing such patterns is explained and various aspects of knowledge-based pattern generation are highlighted. This approach to the generation of patterns is detailed with a concrete example.

Character recognition — A review

Relevance: 90.00%

Abstract:

The machine replication of human reading has been the subject of intensive research for more than three decades. A large number of research papers and reports have already been published on this topic. Many commercial establishments have manufactured recognizers of varying capabilities. Handheld, desk-top, medium-size and large systems costing as much as half a million dollars are available, and are in use for various applications. However, the ultimate goal of developing a reading machine with the same reading capabilities as humans remains unachieved. There is still a great gap between human and machine reading capabilities, and a great amount of further effort is required to narrow this gap, if not bridge it. This review is organized into six major sections covering a general overview (an introduction), applications of character recognition techniques, methodologies in character recognition, research work in character recognition, some practical OCRs and the conclusions.

Improved recognition of aged Kannada documents by effective segmentation of merged characters

Relevance: 90.00%

Abstract:

In optical character recognition of very old books, the recognition accuracy drops mainly due to the merging or breaking of characters. In this paper, we propose the first algorithm to segment merged Kannada characters by using a hypothesis to select the positions to be cut. This method searches for the best possible positions to segment, by taking into account the support vector machine classifier's recognition score and the validity of the aspect ratio (width to height ratio) of the segments between every pair of cut positions. The hypothesis to select the cut position is based on the fact that a concave surface exists above and below the touching portion. These concave surfaces are located by tracing the valleys in the top contour of the image and doing the same for the image rotated upside-down. The cut positions are then derived as closely matching valleys of the original and the rotated images. Our proposed segmentation algorithm handles different font styles, shapes and sizes better than the existing vertical projection profile based segmentation. The proposed algorithm has been tested on 1125 different word images, each containing multiple merged characters, from an old Kannada book; 89.6% correct segmentation is achieved and the character recognition accuracy of merged words is 91.2%. A few points of merge are still missed because no matched valley exists for the specific shapes of the characters meeting at those merges.
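The valley-matching hypothesis can be illustrated with a small sketch. It covers only the selection of candidate cut columns; the SVM scoring and aspect-ratio validation described in the abstract are not shown. As assumptions of this example, a vertical flip stands in for the upside-down rotation so that column indices stay aligned, a valley is taken to be a strict local maximum of contour depth, and all function names and the `tolerance` parameter are hypothetical.

```python
def contour_depth(image):
    """For each column, the number of background rows above the first foreground pixel."""
    height = len(image)
    return [
        next((r for r in range(height) if image[r][col]), height)
        for col in range(len(image[0]))
    ]


def valleys(depths):
    """Columns where the contour dips: depth is strictly greater than at both neighbours."""
    return [
        i for i in range(1, len(depths) - 1)
        if depths[i - 1] < depths[i] > depths[i + 1]
    ]


def cut_candidates(image, tolerance=1):
    """Columns where a valley in the top contour closely matches a valley in the bottom contour."""
    top = valleys(contour_depth(image))
    bottom = valleys(contour_depth(image[::-1]))  # vertical flip exposes the bottom contour
    return [(t + b) // 2 for t in top for b in bottom if abs(t - b) <= tolerance]


# Two blobs joined by a one-pixel bridge in column 3 (1 = ink, 0 = background).
image = [
    [0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0],
]
print(cut_candidates(image))  # [3]
```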

Implementation of Art1 and Art2 Artificial Neural Networks on Ring and Mesh Architectures

Relevance: 80.00%

Abstract:

Artificial Neural Networks (ANNs) are being used to solve a variety of problems in pattern recognition, robotic control, VLSI CAD and other areas. In most of these applications, a speedy response from the ANNs is imperative. However, ANNs comprise a large number of artificial neurons, and a massive interconnection network among them. Hence, implementation of these ANNs involves execution of computationally intensive operations. The usage of multiprocessor systems therefore becomes necessary. In this article, we have presented the implementation of ART1 and ART2 ANNs on ring and mesh architectures. The overall system design and implementation aspects are presented. The performance of the algorithm on ring, 2-dimensional mesh and n-dimensional mesh topologies is presented. The parallel algorithm presented for implementation of ART1 is not specific to any particular architecture. The parallel algorithm for ART2 is more suitable for a ring architecture.

Speeding up AdaBoost Classifier with Random Projection

Relevance: 80.00%

Abstract:

The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular in the machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options to reduce dimensionality, namely principal component analysis and random projection, are briefly examined. Random projection subject to a probabilistic length-preserving transformation is explored further as a computationally light preprocessing step. The experimental results obtained demonstrate the effectiveness of the proposed training process for handling high-dimensional large datasets.
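The length-preserving property of random projection can be seen in a short sketch. The abstract does not specify the projection matrix, so this example assumes the common Gaussian construction with entries drawn from N(0, 1/k); the dimensions `d` and `k`, the seed, and the function names are illustrative choices only, and the AdaBoost training step itself is not shown.

```python
import math
import random


def random_projection_matrix(k, d, rng):
    """k x d matrix with i.i.d. N(0, 1/k) entries, so squared lengths are preserved in expectation."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional vector to k dimensions."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


rng = random.Random(0)
d, k = 500, 100
x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
R = random_projection_matrix(k, d, rng)
z = project(R, x)
ratio = norm(z) / norm(x)
print(len(z), ratio)  # 100 features; the ratio should be close to 1
```

Unlike principal component analysis, building `R` needs no pass over the training data, which is why it is a light preprocessing step before a classifier is trained on the reduced features.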

A fuzzy multistage evolutionary (FUME) clustering technique

Relevance: 80.00%

Abstract:

In this paper, a multistage evolutionary scheme is proposed for clustering in a large database, such as speech data. This is achieved by clustering a small subset of the entire sample set in each stage and treating the cluster centroids so obtained as samples, together with another subset of samples not considered previously, as input data to the next stage. This is continued till the whole sample set is exhausted. The clustering is accomplished by constructing a fuzzy similarity matrix and using the fuzzy techniques proposed here. The technique is illustrated by an efficient scheme for voiced-unvoiced-silence classification of speech.

Distribution of black nodes at various levels in a linear quadtree

Relevance: 80.00%

Abstract:

The distribution of black leaf nodes at each level of a linear quadtree is of significant interest in the context of estimation of time and space complexities of linear quadtree based algorithms. The maximum number of black nodes of a given level that can be fitted in a square grid of size 2^n × 2^n can readily be estimated from the ratio of areas. We show that the actual value of the maximum number of nodes of a level is much less than the maximum obtained from the ratio of the areas. This is due to the fact that the number of nodes possible at a level k, 0 ≤ k ≤ n − 1, should consider the sum of areas occupied by the actual number of nodes present at levels k + 1, k + 2, …, n − 1.

A knowledge-based clustering scheme

Relevance: 80.00%

Abstract:

In this paper the notion of conceptual cohesiveness is made precise and used to group objects semantically, based on a knowledge structure called ‘cohesion forest’. A set of axioms is proposed which should be satisfied to make the generated clusters meaningful.

Learning Optimal Discriminant Functions through a Cooperative Game of Automata

Relevance: 80.00%

Abstract:

The problem of learning correct decision rules to minimize the probability of misclassification is a long-standing problem of supervised learning in pattern recognition. The problem of learning such optimal discriminant functions is considered for the class of problems where the statistical properties of the pattern classes are completely unknown. The problem is posed as a game with common payoff played by a team of mutually cooperating learning automata. This essentially results in a probabilistic search through the space of classifiers. The approach is inherently capable of learning discriminant functions that are nonlinear in their parameters also. A learning algorithm is presented for the team and convergence is established. It is proved that the team can obtain the optimal classifier to an arbitrary approximation. Simulation results with a few examples are presented where the team learns the optimal classifier.

Syntactic Approach to ECG Rhythm Analysis

Relevance: 80.00%

Abstract:

A diagnostic system for ECG rhythm monitoring based on syntactic approaches to pattern recognition is presented here. The method proposed exploits the difference in shape and structure between arrhythmic and normal ECG patterns to generate distinctly different descriptions in terms of a chosen set of primitives. A given frame of signal is first approximated piecewise linearly into a set of line segments which are completely specified in terms of their length and slope values. The slope values are quantized into seven distinct levels and a unit-length line segment with a slope value in each of these levels is coded as a slope symbol. Seven such slope symbols constitute the set of primitives. The given signal is represented as a string of such symbols based on the length and angle of the line segments approximating the signal. Context-free languages are used for describing the classes of abnormal and normal ECG patterns considered here. Analysis of actual ECG data shows efficiency comparable with that of existing methods and a saving in processing time.
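The coding of line segments into slope primitives might look like the sketch below. The abstract gives neither the quantisation boundaries nor the symbol names, so this example assumes seven equal-width bins on the segment angle and the hypothetical symbols `a` (steepest descent) to `g` (steepest ascent); the context-free grammars used for classification are not shown.

```python
import math

SYMBOLS = "abcdefg"  # hypothetical names for the seven slope primitives


def slope_symbol(slope):
    """Quantise a slope into one of seven levels using equal-width angle bins (an assumption)."""
    angle = math.degrees(math.atan(slope))  # lies in (-90, 90)
    index = int((angle + 90.0) / (180.0 / 7))
    return SYMBOLS[min(index, 6)]


def encode(segments):
    """Turn (length, slope) line segments into a primitive string, one symbol per unit of length."""
    return "".join(
        slope_symbol(slope) * max(1, round(length)) for length, slope in segments
    )


# Flat baseline, sharp rise, sharp fall, flat baseline: a crude spike-like shape.
segments = [(3, 0.0), (2, 5.0), (2, -5.0), (4, 0.0)]
print(encode(segments))  # dddggaadddd
```

Repeating a symbol once per unit of length lets the string carry both the angle and the length of each segment, so a grammar over these seven primitives can distinguish wide and narrow waveform features.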

A computationally efficient technique for data-clustering

Relevance: 80.00%

Abstract:

A computationally efficient agglomerative clustering algorithm based on multilevel theory is presented. Here, the data set is divided randomly into a number of partitions. The samples of each such partition are clustered separately using a hierarchical agglomerative clustering algorithm to form sub-clusters. These are merged at higher levels to get the final classification. This algorithm leads to the same classification as the hierarchical agglomerative clustering algorithm when the clusters are well separated. The advantages of this algorithm are short run time and small storage requirement. It is observed that the savings, in storage space and computation time, increase nonlinearly with the sample size.
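The partition-then-merge scheme can be sketched in a few lines. The abstract does not name the linkage or the stopping rule, so this example assumes single linkage with a distance threshold at both levels; `single_linkage`, `multilevel_clustering` and `as_partition` are hypothetical names, and the code illustrates only the structure of the method, not its reported savings.

```python
import math
import random


def single_linkage(groups, threshold):
    """Agglomerative single linkage: merge the two closest groups until none are within threshold."""
    groups = [list(g) for g in groups]
    while len(groups) > 1:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = min(math.dist(p, q) for p in groups[i] for q in groups[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        groups[i].extend(groups.pop(j))
    return groups


def multilevel_clustering(points, n_partitions, threshold, rng):
    """Cluster random partitions separately, then merge the resulting sub-clusters at a higher level."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    partitions = [shuffled[i::n_partitions] for i in range(n_partitions)]
    sub_clusters = []
    for part in partitions:
        sub_clusters.extend(single_linkage([[p] for p in part], threshold))
    return single_linkage(sub_clusters, threshold)


def as_partition(groups):
    """Order-independent view of a clustering, for comparison."""
    return {frozenset(g) for g in groups}


points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (0.2, 0.8),
          (10, 10), (11, 10), (10, 11), (11, 11), (10.5, 10.5), (10.2, 10.8)]
multi = multilevel_clustering(points, 3, 2.0, random.Random(1))
flat = single_linkage([[p] for p in points], 2.0)
print(as_partition(multi) == as_partition(flat))  # True
```

With two well-separated groups, the multilevel result matches the flat hierarchical result, which mirrors the equivalence claimed in the abstract for well-separated clusters.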

Estimation of fuzzy memberships from histograms

Relevance: 80.00%

Abstract:

Based on the conclusions drawn in the bijective transformation between possibility and probability, a method is proposed to estimate the fuzzy membership function for pattern recognition purposes. A rational function approximation to the probability density function is obtained from the histogram of a finite (and sometimes very small) number of samples. This function is normalized such that the highest ordinate is one. The parameters representing the rational function are used for classifying the pattern samples based on a max-min decision rule. The method is illustrated with examples.
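A simplified sketch of the normalisation and decision steps follows. It skips the rational function approximation and uses the histogram itself, scaled so that its highest ordinate is one, as the membership function. The max-min rule is read here as: take the minimum membership over the features for each class, then pick the class with the maximum. That reading, the bin edges, the sample data and the function names are all assumptions of this example.

```python
def bin_index(value, bin_edges):
    """Index of the half-open bin [edge_b, edge_b+1) containing value, or None if out of range."""
    for b in range(len(bin_edges) - 1):
        if bin_edges[b] <= value < bin_edges[b + 1]:
            return b
    return None


def membership_from_histogram(values, bin_edges):
    """Histogram the samples, then scale so the tallest bin has membership 1."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        b = bin_index(v, bin_edges)
        if b is not None:
            counts[b] += 1
    peak = max(counts)  # assumes at least one sample falls inside the bins
    return [c / peak for c in counts]


def membership(value, bin_edges, table):
    b = bin_index(value, bin_edges)
    return table[b] if b is not None else 0.0


def classify(sample, bin_edges, class_tables):
    """Max-min rule: score each class by its minimum membership over features, pick the maximum."""
    scores = {
        label: min(membership(x, bin_edges, t) for x, t in zip(sample, tables))
        for label, tables in class_tables.items()
    }
    return max(scores, key=scores.get)


edges = [0, 1, 2, 3, 4]
class_tables = {
    "A": [membership_from_histogram([0.5, 1.2, 1.5, 1.7, 2.5], edges),
          membership_from_histogram([0.1, 0.4, 0.6, 1.5], edges)],
    "B": [membership_from_histogram([2.2, 2.8, 3.1, 3.5, 3.9], edges),
          membership_from_histogram([2.5, 2.6, 3.2], edges)],
}
print(classify((1.4, 0.3), edges, class_tables))  # A
print(classify((3.2, 2.7), edges, class_tables))  # B
```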
