994 resultados para Hierarchical document


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We report a hierarchical blind script identifier for 11 different Indian scripts. An initial grouping of the 11 scripts is accomplished at the first level of this hierarchy. At the subsequent level, we recognize the script in each group. The various nodes of this tree use different feature-classifier combinations. A database of 20,000 words of different font styles and sizes is collected and used for each script. Effectiveness of Gabor and Discrete Cosine Transform features has been independently, evaluated using nearest neighbor linear discriminant and support vector machine classifiers. The minimum and maximum accuracies obtained, using this hierarchical mechanism, are 92.2% and 97.6%, respectively.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we present a new feature-based approach for mosaicing of camera-captured document images. A novel block-based scheme is employed to ensure that corners can be reliably detected over a wide range of images. 2-D discrete cosine transform is computed for image blocks defined around each of the detected corners and a small subset of the coefficients is used as a feature vector A 2-pass feature matching is performed to establish point correspondences from which the homography relating the input images could be computed. The algorithm is tested on a number of complex document images casually taken from a hand-held camera yielding convincing results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Skew correction of complex document images is a difficult task. We propose an edge-based connected component approach for robust skew correction of documents with complex layout and content. The algorithm essentially consists of two steps - an 'initialization' step to determine the image orientation from the centroids of the connected components and a 'search' step to find the actual skew of the image. During initialization, we choose two different sets of points regularly spaced across the the image, one from the left to right and the other from top to bottom. The image orientation is determined from the slope between the two succesive nearest neighbors of each of the points in the chosen set. The search step finds succesive nearest neighbors that satisfy the parameters obtained in the initialization step. The final skew is determined from the slopes obtained in the 'search' step. Unlike other connected component based methods, the proposed method does not require any binarization step that generally precedes connected component analysis. The method works well for scanned documents with complex layout of any skew with a precision of 0.5 degrees.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The document images that are fed into an Optical Character Recognition system, might be skewed. This could be due to improper feeding of the document into the scanner or may be due to a faulty scanner. In this paper, we propose a skew detection and correction method for document images. We make use of the inherent randomness in the Horizontal Projection profiles of a text block image, as the skew of the image varies. The proposed algorithm has proved to be very robust and time efficient. The entire process takes less than a second on a 2.4 GHz Pentium IV PC.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Network Interfaces (NIs) are used in Multiprocessor System-on-Chips (MPSoCs) to connect CPUs to a packet switched Network-on-Chip. In this work we introduce a new NI architecture for our hierarchical CoreVA-MPSoC. The CoreVA-MPSoC targets streaming applications in embedded systems. The main contribution of this paper is a system-level analysis of different NI configurations, considering both software and hardware costs for NoC communication. Different configurations of the NI are compared using a benchmark suite of 10 streaming applications. The best performing NI configuration shows an average speedup of 20 for a CoreVA-MPSoC with 32 CPUs compared to a single CPU. Furthermore, we present physical implementation results using a 28 nm FD-SOI standard cell technology. A hierarchical MPSoC with 8 CPU clusters and 4 CPUs in each cluster running at 800MHz requires an area of 4.56mm2.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Parallel programming and effective partitioning of applications for embedded many-core architectures requires optimization algorithms. However, these algorithms have to quickly evaluate thousands of different partitions. We present a fast performance estimator embedded in a parallelizing compiler for streaming applications. The estimator combines a single execution-based simulation and an analytic approach. Experimental results demonstrate that the estimator has a mean error of 2.6% and computes its estimation 2848 times faster compared to a cycle accurate simulator.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Electronic document management (EDM) technology has the potential to enhance the information management in construction projects considerably, without radical changes to current practice. Over the past fifteen years this topic has been overshadowed by building product modelling in the construction IT research world, but at present EDM is quickly being introduced in practice, in particular in bigger projects. Often this is done in the form of third party services available over the World Wide Web. In the paper, a typology of research questions and methods is presented, which can be used to position the individual research efforts which are surveyed in the paper. Questions dealt with include: What features should EMD systems have? How much are they used? Are there benefits from use and how should these be measured? What are the barriers to wide-spread adoption? Which technical questions need to be solved? Is there scope for standardisation? How will the market for such systems evolve?

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Triggered by the very quick proliferation of Internet connectivity, electronic document management (EDM) systems are now rapidly being adopted for managing the documentation that is produced and exchanged in construction projects. Nevertheless there are still substantial barriers to the efficient use of such systems, mainly of a psychological nature and related to insufficient training. This paper presents the results of empirical studies carried out during 2002 concerning the current usage of EDM systems in the Finnish construction industry. The studies employed three different methods in order to provide a multifaceted view of the problem area, both on the industry and individual project level. In order to provide an accurate measurement of overall usage volume in the industry as a whole telephone interviews with key personnel from 100 randomly chosen construction projects were conducted. The interviews showed that while around 1/3 of big projects already have adopted the use of EDM, very few small projects have adopted this technology. The barriers to introduction were investigated through interviews with representatives for half a dozen of providers of systems and ASP-services. These interviews shed a lot of light on the dynamics of the market for this type of services and illustrated the diversity of business strategies adopted by vendors. In the final study log files from a project which had used an EDM system were analysed in order to determine usage patterns. The results illustrated that use is yet incomplete in coverage and that only a part of the individuals involved in the project used the system efficiently, either as information producers or consumers. The study also provided feedback on the usefulness of the log files.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Separation of printed text blocks from the non-text areas, containing signatures, handwritten text, logos and other such symbols, is a necessary first step for an OCR involving printed text recognition. In the present work, we compare the efficacy of some feature-classifier combinations to carry out this separation task. We have selected length-nomalized horizontal projection profile (HPP) as the starting point of such a separation task. This is with the assumption that the printed text blocks contain lines of text which generate HPP's with some regularity. Such an assumption is demonstrated to be valid. Our features are the HPP and its two transformed versions, namely, eigen and Fisher profiles. Four well known classifiers, namely, Nearest neighbor, Linear discriminant function, SVM's and artificial neural networks have been considered and efficiency of the combination of these classifiers with the above features is compared. A sequential floating feature selection technique has been adopted to enhance the efficiency of this separation task. The results give an average accuracy of about 96.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Extraction of text areas from the document images with complex content and layout is one of the challenging tasks. Few texture based techniques have already been proposed for extraction of such text blocks. Most of such techniques are greedy for computation time and hence are far from being realizable for real time implementation. In this work, we propose a modification to two of the existing texture based techniques to reduce the computation. This is accomplished with Harris corner detectors. The efficiency of these two textures based algorithms, one based on Gabor filters and other on log-polar wavelet signature, are compared. A combination of Gabor feature based texture classification performed on a smaller set of Harris corner detected points is observed to deliver the accuracy and efficiency.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Conformance testing focuses on checking whether an implementation. under test (IUT) behaves according to its specification. Typically, testers are interested it? performing targeted tests that exercise certain features of the IUT This intention is formalized as a test purpose. The tester needs a "strategy" to reach the goal specified by the test purpose. Also, for a particular test case, the strategy should tell the tester whether the IUT has passed, failed. or deviated front the test purpose. In [8] Jeron and Morel show how to compute, for a given finite state machine specification and a test purpose automaton, a complete test graph (CTG) which represents all test strategies. In this paper; we consider the case when the specification is a hierarchical state machine and show how to compute a hierarchical CTG which preserves the hierarchical structure of the specification. We also propose an algorithm for an online test oracle which avoids a space overhead associated with the CTG.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper discusses a method for scaling SVM with Gaussian kernel function to handle large data sets by using a selective sampling strategy for the training set. It employs a scalable hierarchical clustering algorithm to construct cluster indexing structures of the training data in the kernel induced feature space. These are then used for selective sampling of the training data for SVM to impart scalability to the training process. Empirical studies made on real world data sets show that the proposed strategy performs well on large data sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we propose a novel dexterous technique for fast and accurate recognition of online handwritten Kannada and Tamil characters. Based on the primary classifier output and prior knowledge, the best classifier is chosen from set of three classifiers for second stage classification. Prior knowledge is obtained through analysis of the confusion matrix of primary classifier which helped in identifying the multiple sets of confused characters. Further, studies were carried out to check the performance of secondary classifiers in disambiguating among the confusion sets. Using this technique we have achieved an average accuracy of 92.6% for Kannada characters on the MILE lab dataset and 90.2% for Tamil characters on the HP Labs dataset.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We describe the design of a directory-based shared memory architecture on a hierarchical network of hypercubes. The distributed directory scheme comprises two separate hierarchical networks for handling cache requests and transfers. Further, the scheme assumes a single address space and each processing element views the entire network as contiguous memory space. The size of individual directories stored at each node of the network remains constant throughout the network. Although the size of the directory increases with the network size, the architecture is scalable. The results of the analytical studies demonstrate superior performance characteristics of our scheme compared with those of other schemes.

Relevância:

20.00% 20.00%

Publicador: