78 results for Machine Typed Document


Relevance: 20.00%

Abstract:

We report a hierarchical blind script identifier for 11 different Indian scripts. An initial grouping of the 11 scripts is accomplished at the first level of this hierarchy. At the subsequent level, we recognize the script in each group. The various nodes of this tree use different feature-classifier combinations. A database of 20,000 words of different font styles and sizes is collected and used for each script. The effectiveness of Gabor and Discrete Cosine Transform features has been independently evaluated using nearest neighbor, linear discriminant, and support vector machine classifiers. The minimum and maximum accuracies obtained using this hierarchical mechanism are 92.2% and 97.6%, respectively.

Relevance: 20.00%

Abstract:

In this paper, we present a new feature-based approach for mosaicing of camera-captured document images. A novel block-based scheme is employed to ensure that corners can be reliably detected over a wide range of images. A 2-D discrete cosine transform is computed for image blocks defined around each of the detected corners, and a small subset of the coefficients is used as a feature vector. A 2-pass feature matching is performed to establish point correspondences, from which the homography relating the input images can be computed. The algorithm is tested on a number of complex document images casually taken with a hand-held camera, yielding convincing results.

Relevance: 20.00%

Abstract:

Skew correction of complex document images is a difficult task. We propose an edge-based connected component approach for robust skew correction of documents with complex layout and content. The algorithm essentially consists of two steps: an 'initialization' step to determine the image orientation from the centroids of the connected components, and a 'search' step to find the actual skew of the image. During initialization, we choose two different sets of points regularly spaced across the image, one from left to right and the other from top to bottom. The image orientation is determined from the slope between the two successive nearest neighbors of each of the points in the chosen set. The search step finds successive nearest neighbors that satisfy the parameters obtained in the initialization step. The final skew is determined from the slopes obtained in the 'search' step. Unlike other connected component based methods, the proposed method does not require the binarization step that generally precedes connected component analysis. The method works well for scanned documents with complex layout of any skew, with a precision of 0.5 degrees.

Relevance: 20.00%

Abstract:

This paper describes an approach based on Zernike moments and Delaunay triangulation for localization of hand-written text in machine printed text documents. The Zernike moments of the image are first evaluated, and we classify the text as hand-written using the nearest neighbor classifier. These features are independent of size, slant, orientation, translation and other variations in handwritten text. We then use Delaunay triangulation to reclassify the misclassified text regions. When imposing Delaunay triangulation on the centroid points of the connected components, we extract features based on the triangles and reclassify the text. We remove the noise components in the document as part of the preprocessing step, so this method works well on noisy documents. The success rate of the method is found to be 86%. For specific hand-written elements such as signatures or similar text, the accuracy is found to be even higher, at 93%.

Relevance: 20.00%

Abstract:

Inventory management (IM) has a decisive role in the enhancement of manufacturing industry's competitiveness. Therefore, major manufacturing industries are following IM practices with the intention of improving their performance. However, the effort to introduce IM in SMEs is very limited due to lack of initiation, expertise, and financial constraints. This paper aims to provide a guideline for entrepreneurs in enhancing their IM performance, as it presents the results of a survey based study carried out for machine tool Small and Medium Enterprises (SMEs) in Bangalore. Having established the significance of inventory as an input, we probed the relationship between IM performance and economic performance of these SMEs. To the extent possible all the factors of production and performance indicators were deliberately considered in pure economic terms. All economic performance indicators adopted seem to have a positive and significant association with IM performance in SMEs. On the whole, we found that SMEs which are IM efficient are likely to perform better on the economic front also and experience higher returns to scale.

Relevance: 20.00%

Abstract:

The determination of the overconsolidation ratio (OCR) of clay deposits is an important task in geotechnical engineering practice. This paper examines the potential of a support vector machine (SVM) for predicting the OCR of clays from piezocone penetration test data. SVM is a statistical learning theory based on a structural risk minimization principle that minimizes both error and weight terms. The five input variables used for the SVM model for prediction of OCR are the corrected cone resistance (qt), vertical total stress (sigmav), hydrostatic pore pressure (u0), pore pressure at the cone tip (u1), and the pore pressure just above the cone base (u2). Sensitivity analysis has been performed to investigate the relative importance of each of the input parameters. From the sensitivity analysis, it is clear that qt is the primary in situ measurement influenced by OCR, followed by sigmav, u0, u2, and u1. A comparison between SVM and some of the traditional interpretation methods is also presented. The results of this study have shown that the SVM approach has the potential to be a practical tool for determination of OCR.

Relevance: 20.00%

Abstract:

The determination of settlement of shallow foundations on cohesionless soil is an important task in geotechnical engineering. Available methods for the determination of settlement are not reliable. In this study, the support vector machine (SVM), a novel type of learning algorithm based on statistical theory, has been used to predict the settlement of shallow foundations on cohesionless soil. SVM uses a regression technique by introducing an ε-insensitive loss function. A thorough sensitivity analysis has been made to ascertain which parameters have the greatest influence on settlement. The study shows that SVM has the potential to be a useful and practical tool for prediction of settlement of shallow foundations on cohesionless soil.
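
The ε-insensitive loss mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard textbook definition, max(0, |y - f(x)| - ε), and is not taken from the paper itself; the function names and the default ε value are assumptions chosen for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Standard epsilon-insensitive loss: errors within +/- epsilon cost nothing.

    The default epsilon is an illustrative assumption, not a value from the paper.
    """
    return max(0.0, abs(y_true - y_pred) - epsilon)


def mean_epsilon_insensitive_loss(targets, predictions, epsilon=0.1):
    """Average the loss over paired observed and predicted values."""
    losses = [
        epsilon_insensitive_loss(t, p, epsilon)
        for t, p in zip(targets, predictions)
    ]
    return sum(losses) / len(losses)
```

Predictions that fall inside the ε tube around the observed value contribute zero loss, which is why only the points outside the tube end up shaping the fitted regression function.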

Relevance: 20.00%

Abstract:

In this paper, direct torque control (DTC) algorithms for a split-phase induction machine (SPIM) are established. An SPIM has two sets of three-phase stator windings, with a shift of thirty electrical degrees between them. The significant contributions of this paper are: 1) two new methods of DTC technique for an SPIM are developed, called Resultant Flux Control Method and Individual Flux Control Method and 2) advantages and disadvantages of both methods are discussed. High torque ripple is a disadvantage for three-phase DTC. It is found that torque ripple in an SPIM can be significantly reduced without increasing the switching frequency.

Relevance: 20.00%

Abstract:

During restoration, transmission line switching is one of the major causes of transient overvoltages. Though detailed electromagnetic transient studies are carried out extensively for the planning and design of transmission systems, such studies are not common in the day-to-day operation of power systems. However, it is important for the operator to ensure during restoration of supply that peak overvoltages resulting from the switching operations are well within safe limits. This paper presents a support vector machine approach to classify the various cases of line energization as safe or unsafe based upon the peak value of overvoltage at the receiving end of the line. The operator can define the threshold value of voltage used to assign each data pattern to one of the two classes. For illustration of the proposed approach, the power system used for switching transient peak overvoltage tests is a 400 kV equivalent system of an Indian southern gri

Relevance: 20.00%

Abstract:

Statistical learning algorithms provide a viable framework for geotechnical engineering modeling. This paper describes two statistical learning algorithms applied for site characterization modeling based on standard penetration test (SPT) data. More than 2700 field SPT values (N) have been collected from 766 boreholes spread over an area of 220 sq km in Bangalore. To obtain the corrected value (N_c), the N values have been corrected for different parameters such as overburden stress, size of borehole, type of sampler, length of connecting rod, etc. In the three-dimensional site characterization model, the function N_c = N_c(X, Y, Z), where X, Y and Z are the coordinates of a point corresponding to an N_c value, is to be approximated so that the N_c value at any half-space point in Bangalore can be determined. The first algorithm uses the least-squares support vector machine (LSSVM), which is related to a ridge regression type of support vector machine. The second algorithm uses the relevance vector machine (RVM), which combines the strengths of kernel-based methods and Bayesian theory to establish the relationships between a set of input vectors and a desired output. The paper also presents a comparative study between the developed LSSVM and RVM models for site characterization. Copyright (C) 2009 John Wiley & Sons, Ltd.

Relevance: 20.00%

Abstract:

This study considers the scheduling problem observed in the burn-in operation of semiconductor final testing, where jobs are associated with release times, due dates, processing times, sizes, and non-agreeable release times and due dates. The burn-in oven is modeled as a batch-processing machine which can process a batch of several jobs as long as the total size of the jobs does not exceed the machine capacity, and the processing time of a batch is equal to the longest processing time among all the jobs in the batch. Due to the importance of on-time delivery in semiconductor manufacturing, the objective of this problem is to minimize total weighted tardiness. We have formulated the scheduling problem as an integer linear programming model and empirically show its computational intractability. Due to this intractability, we propose a few simple greedy heuristic algorithms and a meta-heuristic algorithm, simulated annealing (SA). A series of computational experiments is conducted to evaluate the performance of the proposed heuristic algorithms in comparison with the exact solution on various small-size problem instances and in comparison with an estimated optimal solution on various real-life large-size problem instances. The computational results show that the SA algorithm, with an initial solution obtained using our own proposed greedy heuristic algorithm, consistently finds a robust solution in a reasonable amount of computation time.
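
The batch-processing model described in this abstract can be made concrete with a small sketch that evaluates total weighted tardiness for a given sequence of batches. It is not the authors' formulation or heuristic. The `Job` fields, the function names, and the rule that a batch starts once the machine is free and all of its jobs have been released are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical field names; the abstract lists these attributes without notation.
    release: float
    due: float
    processing: float
    size: float
    weight: float


def batch_is_feasible(batch, capacity):
    """A batch fits if the total job size does not exceed the machine capacity."""
    return sum(job.size for job in batch) <= capacity


def total_weighted_tardiness(batches, capacity):
    """Evaluate a fixed sequence of non-empty batches on a single batch machine.

    Assumption: a batch starts when the machine is free and every job in it
    has been released. Its processing time is the longest job time in it.
    """
    clock = 0.0
    total = 0.0
    for batch in batches:
        if not batch_is_feasible(batch, capacity):
            raise ValueError("batch exceeds machine capacity")
        start = max(clock, max(job.release for job in batch))
        clock = start + max(job.processing for job in batch)
        # All jobs in a batch complete together at time `clock`.
        total += sum(job.weight * max(0.0, clock - job.due) for job in batch)
    return total
```

An evaluator of this kind is the piece a greedy heuristic or a simulated annealing search would call repeatedly to score candidate batch sequences.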

Relevance: 20.00%

Abstract:

Core Vector Machine (CVM) is suitable for efficient large-scale pattern classification. In this paper, a method is proposed for improving the performance of CVM with a Gaussian kernel function irrespective of the ordering of patterns belonging to different classes within the data set. This method employs selective sampling based training of CVM using a novel kernel based scalable hierarchical clustering algorithm. Empirical studies made on synthetic and real world data sets show that the proposed strategy performs well on large data sets.

Relevance: 20.00%

Abstract:

Separation of printed text blocks from the non-text areas, containing signatures, handwritten text, logos and other such symbols, is a necessary first step for an OCR involving printed text recognition. In the present work, we compare the efficacy of some feature-classifier combinations to carry out this separation task. We have selected the length-normalized horizontal projection profile (HPP) as the starting point of such a separation task. This is with the assumption that the printed text blocks contain lines of text which generate HPPs with some regularity. Such an assumption is demonstrated to be valid. Our features are the HPP and its two transformed versions, namely, eigen and Fisher profiles. Four well-known classifiers, namely, nearest neighbor, linear discriminant function, SVMs and artificial neural networks, have been considered, and the efficiency of the combination of these classifiers with the above features is compared. A sequential floating feature selection technique has been adopted to enhance the efficiency of this separation task. The results give an average accuracy of about 96.
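
A horizontal projection profile of the kind this abstract starts from can be sketched in a few lines. The abstract does not say how the profile is length-normalized, so the nearest-index resampling to a fixed number of bins used here is only one plausible reading and is an assumption, as are the function names and the 0/1 image encoding.

```python
def horizontal_projection_profile(image):
    """Count ink pixels in each row of a binary image.

    `image` is a list of rows, each a list of 0/1 values (1 = ink).
    This encoding is an assumption made for illustration.
    """
    return [sum(row) for row in image]


def length_normalize(profile, length):
    """Resample a profile to `length` bins by nearest-index sampling.

    Assumption: length normalization means mapping blocks of different
    heights to a common profile length. The paper may define it differently.
    """
    n = len(profile)
    return [
        profile[min(n - 1, int((k + 0.5) * n / length))]
        for k in range(length)
    ]
```

For a block of printed text, the profile alternates between high values on text lines and near-zero values in the gaps between them, which is the regularity the abstract relies on.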

Relevance: 20.00%

Abstract:

Extraction of text areas from document images with complex content and layout is a challenging task. A few texture-based techniques have already been proposed for extraction of such text blocks. Most of these techniques are computationally expensive and hence far from realizable in a real-time implementation. In this work, we propose a modification to two of the existing texture-based techniques to reduce the computation. This is accomplished with Harris corner detectors. The efficiency of these two texture-based algorithms, one based on Gabor filters and the other on log-polar wavelet signatures, is compared. A combination of Gabor feature based texture classification performed on a smaller set of Harris corner detected points is observed to deliver both accuracy and efficiency.

Relevance: 20.00%

Abstract:

In this paper, we propose a novel method using wavelets as input to neural network self-organizing maps and a support vector machine for classification of magnetic resonance (MR) images of the human brain. The proposed method classifies MR brain images as either normal or abnormal. We have tested the proposed approach using a dataset of 52 MR brain images. A classification rate of more than 94% was achieved using the neural network self-organizing maps (SOM), and 98% using the support vector machine. We observed that the classification rate is higher for the support vector machine classifier than for the self-organizing map-based approach.