880 resultados para Automated segmentation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes a method of automated segmentation of speech assuming the signal is continuously time varying rather than the traditional short time stationary model. It has been shown that this representation gives comparable if not marginally better results than the other techniques for automated segmentation. A formulation of the 'Bach' (music semitonal) frequency scale filter-bank is proposed. A comparative study has been made of the performances using Mel, Bark and Bach scale filter banks considering this model. The preliminary results show up to 80 % matches within 20 ms of the manually segmented data, without any information of the content of the text and without any language dependence. 'Bach' filters are seen to marginally outperform the other filters.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This correspondence describes a method for automated segmentation of speech. The method proposed in this paper uses a specially designed filter-bank called Bach filter-bank which makes use of 'music' related perception criteria. The speech signal is treated as continuously time varying signal as against a short time stationary model. A comparative study has been made of the performances using Mel, Bark and Bach scale filter banks. The preliminary results show up to 80 % matches within 20 ms of the manually segmented data, without any information of the content of the text and without any language dependence. The Bach filters are seen to marginally outperform the other filters.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Myocardial perfusion quantification by means of Contrast-Enhanced Cardiac Magnetic Resonance images relies on time consuming frame-by-frame manual tracing of regions of interest. In this Thesis, a novel automated technique for myocardial segmentation and non-rigid registration as a basis for perfusion quantification is presented. The proposed technique is based on three steps: reference frame selection, myocardial segmentation and non-rigid registration. In the first step, the reference frame in which both endo- and epicardial segmentation will be performed is chosen. Endocardial segmentation is achieved by means of a statistical region-based level-set technique followed by a curvature-based regularization motion. Epicardial segmentation is achieved by means of an edge-based level-set technique followed again by a regularization motion. To take into account the changes in position, size and shape of myocardium throughout the sequence due to out of plane respiratory motion, a non-rigid registration algorithm is required. The proposed non-rigid registration scheme consists in a novel multiscale extension of the normalized cross-correlation algorithm in combination with level-set methods. The myocardium is then divided into standard segments. Contrast enhancement curves are computed measuring the mean pixel intensity of each segment over time, and perfusion indices are extracted from each curve. The overall approach has been tested on synthetic and real datasets. For validation purposes, the sequences have been manually traced by an experienced interpreter, and contrast enhancement curves as well as perfusion indices have been computed. Comparisons between automatically extracted and manually obtained contours and enhancement curves showed high inter-technique agreement. Comparisons of perfusion indices computed using both approaches against quantitative coronary angiography and visual interpretation demonstrated that the two technique have similar diagnostic accuracy. In conclusion, the proposed technique allows fast, automated and accurate measurement of intra-myocardial contrast dynamics, and may thus address the strong clinical need for quantitative evaluation of myocardial perfusion.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

PURPOSE Quantification of retinal layers using automated segmentation of optical coherence tomography (OCT) images allows for longitudinal studies of retinal and neurological disorders in mice. The purpose of this study was to compare the performance of automated retinal layer segmentation algorithms with data from manual segmentation in mice using the Spectralis OCT. METHODS Spectral domain OCT images from 55 mice from three different mouse strains were analyzed in total. The OCT scans from 22 C57Bl/6, 22 BALBc, and 11 C3A.Cg-Pde6b(+)Prph2(Rd2) /J mice were automatically segmented using three commercially available automated retinal segmentation algorithms and compared to manual segmentation. RESULTS Fully automated segmentation performed well in mice and showed coefficients of variation (CV) of below 5% for the total retinal volume. However, all three automated segmentation algorithms yielded much thicker total retinal thickness values compared to manual segmentation data (P < 0.0001) due to segmentation errors in the basement membrane. CONCLUSIONS Whereas the automated retinal segmentation algorithms performed well for the inner layers, the retinal pigmentation epithelium (RPE) was delineated within the sclera, leading to consistently thicker measurements of the photoreceptor layer and the total retina. TRANSLATIONAL RELEVANCE The introduction of spectral domain OCT allows for accurate imaging of the mouse retina. Exact quantification of retinal layer thicknesses in mice is important to study layers of interest under various pathological conditions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The synapses in the cerebral cortex can be classified into two main types, Gray’s type I and type II, which correspond to asymmetric (mostly glutamatergic excitatory) and symmetric (inhibitory GABAergic) synapses, respectively. Hence, the quantification and identification of their different types and the proportions in which they are found, is extraordinarily important in terms of brain function. The ideal approach to calculate the number of synapses per unit volume is to analyze 3D samples reconstructed from serial sections. However, obtaining serial sections by transmission electron microscopy is an extremely time consuming and technically demanding task. Using focused ion beam/scanning electron microscope microscopy, we recently showed that virtually all synapses can be accurately identified as asymmetric or symmetric synapses when they are visualized, reconstructed, and quantified from large 3D tissue samples obtained in an automated manner. Nevertheless, the analysis, segmentation, and quantification of synapses is still a labor intensive procedure. Thus, novel solutions are currently necessary to deal with the large volume of data that is being generated by automated 3D electron microscopy. Accordingly, we have developed ESPINA, a software tool that performs the automated segmentation and counting of synapses in a reconstructed 3D volume of the cerebral cortex, and that greatly facilitates and accelerates these processes.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Without knowledge of basic seafloor characteristics, the ability to address any number of critical marine and/or coastal management issues is diminished. For example, management and conservation of essential fish habitat (EFH), a requirement mandated by federally guided fishery management plans (FMPs), requires among other things a description of habitats for federally managed species. Although the list of attributes important to habitat are numerous, the ability to efficiently and effectively describe many, and especially at the scales required, does not exist with the tools currently available. However, several characteristics of seafloor morphology are readily obtainable at multiple scales and can serve as useful descriptors of habitat. Recent advancements in acoustic technology, such as multibeam echosounding (MBES), can provide remote indication of surficial sediment properties such as texture, hardness, or roughness, and further permit highly detailed renderings of seafloor morphology. With acoustic-based surveys providing a relatively efficient method for data acquisition, there exists a need for efficient and reproducible automated segmentation routines to process the data. Using MBES data collected by the Olympic Coast National Marine Sanctuary (OCNMS), and through a contracted seafloor survey, we expanded on the techniques of Cutter et al. (2003) to describe an objective repeatable process that uses parameterized local Fourier histogram (LFH) texture features to automate segmentation of surficial sediments from acoustic imagery using a maximum likelihood decision rule. Sonar signatures and classification performance were evaluated using video imagery obtained from a towed camera sled. Segmented raster images were converted to polygon features and attributed using a hierarchical deep-water marine benthic classification scheme (Greene et al. 1999) for use in a geographical information system (GIS). (PDF contains 41 pages.)

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper, we propose an unsupervised segmentation approach, named "n-gram mutual information", or NGMI, which is used to segment Chinese documents into n-character words or phrases, using language statistics drawn from the Chinese Wikipedia corpus. The approach alleviates the tremendous effort that is required in preparing and maintaining the manually segmented Chinese text for training purposes, and manually maintaining ever expanding lexicons. Previously, mutual information was used to achieve automated segmentation into 2-character words. The NGMI approach extends the approach to handle longer n-character words. Experiments with heterogeneous documents from the Chinese Wikipedia collection show good results.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Histopathology is the clinical standard for tissue diagnosis. However, histopathology has several limitations including that it requires tissue processing, which can take 30 minutes or more, and requires a highly trained pathologist to diagnose the tissue. Additionally, the diagnosis is qualitative, and the lack of quantitation leads to possible observer-specific diagnosis. Taken together, it is difficult to diagnose tissue at the point of care using histopathology.

Several clinical situations could benefit from more rapid and automated histological processing, which could reduce the time and the number of steps required between obtaining a fresh tissue specimen and rendering a diagnosis. For example, there is need for rapid detection of residual cancer on the surface of tumor resection specimens during excisional surgeries, which is known as intraoperative tumor margin assessment. Additionally, rapid assessment of biopsy specimens at the point-of-care could enable clinicians to confirm that a suspicious lesion is successfully sampled, thus preventing an unnecessary repeat biopsy procedure. Rapid and low cost histological processing could also be potentially useful in settings lacking the human resources and equipment necessary to perform standard histologic assessment. Lastly, automated interpretation of tissue samples could potentially reduce inter-observer error, particularly in the diagnosis of borderline lesions.

To address these needs, high quality microscopic images of the tissue must be obtained in rapid timeframes, in order for a pathologic assessment to be useful for guiding the intervention. Optical microscopy is a powerful technique to obtain high-resolution images of tissue morphology in real-time at the point of care, without the need for tissue processing. In particular, a number of groups have combined fluorescence microscopy with vital fluorescent stains to visualize micro-anatomical features of thick (i.e. unsectioned or unprocessed) tissue. However, robust methods for segmentation and quantitative analysis of heterogeneous images are essential to enable automated diagnosis. Thus, the goal of this work was to obtain high resolution imaging of tissue morphology through employing fluorescence microscopy and vital fluorescent stains and to develop a quantitative strategy to segment and quantify tissue features in heterogeneous images, such as nuclei and the surrounding stroma, which will enable automated diagnosis of thick tissues.

To achieve these goals, three specific aims were proposed. The first aim was to develop an image processing method that can differentiate nuclei from background tissue heterogeneity and enable automated diagnosis of thick tissue at the point of care. A computational technique called sparse component analysis (SCA) was adapted to isolate features of interest, such as nuclei, from the background. SCA has been used previously in the image processing community for image compression, enhancement, and restoration, but has never been applied to separate distinct tissue types in a heterogeneous image. In combination with a high resolution fluorescence microendoscope (HRME) and a contrast agent acriflavine, the utility of this technique was demonstrated through imaging preclinical sarcoma tumor margins. Acriflavine localizes to the nuclei of cells where it reversibly associates with RNA and DNA. Additionally, acriflavine shows some affinity for collagen and muscle. SCA was adapted to isolate acriflavine positive features or APFs (which correspond to RNA and DNA) from background tissue heterogeneity. The circle transform (CT) was applied to the SCA output to quantify the size and density of overlapping APFs. The sensitivity of the SCA+CT approach to variations in APF size, density and background heterogeneity was demonstrated through simulations. Specifically, SCA+CT achieved the lowest errors for higher contrast ratios and larger APF sizes. When applied to tissue images of excised sarcoma margins, SCA+CT correctly isolated APFs and showed consistently increased density in tumor and tumor + muscle images compared to images containing muscle. Next, variables were quantified from images of resected primary sarcomas and used to optimize a multivariate model. The sensitivity and specificity for differentiating positive from negative ex vivo resected tumor margins was 82% and 75%. The utility of this approach was further tested by imaging the in vivo tumor cavities from 34 mice after resection of a sarcoma with local recurrence as a bench mark. When applied prospectively to images from the tumor cavity, the sensitivity and specificity for differentiating local recurrence was 78% and 82%. The results indicate that SCA+CT can accurately delineate APFs in heterogeneous tissue, which is essential to enable automated and rapid surveillance of tissue pathology.

Two primary challenges were identified in the work in aim 1. First, while SCA can be used to isolate features, such as APFs, from heterogeneous images, its performance is limited by the contrast between APFs and the background. Second, while it is feasible to create mosaics by scanning a sarcoma tumor bed in a mouse, which is on the order of 3-7 mm in any one dimension, it is not feasible to evaluate an entire human surgical margin. Thus, improvements to the microscopic imaging system were made to (1) improve image contrast through rejecting out-of-focus background fluorescence and to (2) increase the field of view (FOV) while maintaining the sub-cellular resolution needed for delineation of nuclei. To address these challenges, a technique called structured illumination microscopy (SIM) was employed in which the entire FOV is illuminated with a defined spatial pattern rather than scanning a focal spot, such as in confocal microscopy.

Thus, the second aim was to improve image contrast and increase the FOV through employing wide-field, non-contact structured illumination microscopy and optimize the segmentation algorithm for new imaging modality. Both image contrast and FOV were increased through the development of a wide-field fluorescence SIM system. Clear improvement in image contrast was seen in structured illumination images compared to uniform illumination images. Additionally, the FOV is over 13X larger than the fluorescence microendoscope used in aim 1. Initial segmentation results of SIM images revealed that SCA is unable to segment large numbers of APFs in the tumor images. Because the FOV of the SIM system is over 13X larger than the FOV of the fluorescence microendoscope, dense collections of APFs commonly seen in tumor images could no longer be sparsely represented, and the fundamental sparsity assumption associated with SCA was no longer met. Thus, an algorithm called maximally stable extremal regions (MSER) was investigated as an alternative approach for APF segmentation in SIM images. MSER was able to accurately segment large numbers of APFs in SIM images of tumor tissue. In addition to optimizing MSER for SIM image segmentation, an optimal frequency of the illumination pattern used in SIM was carefully selected because the image signal to noise ratio (SNR) is dependent on the grid frequency. A grid frequency of 31.7 mm-1 led to the highest SNR and lowest percent error associated with MSER segmentation.

Once MSER was optimized for SIM image segmentation and the optimal grid frequency was selected, a quantitative model was developed to diagnose mouse sarcoma tumor margins that were imaged ex vivo with SIM. Tumor margins were stained with acridine orange (AO) in aim 2 because AO was found to stain the sarcoma tissue more brightly than acriflavine. Both acriflavine and AO are intravital dyes, which have been shown to stain nuclei, skeletal muscle, and collagenous stroma. A tissue-type classification model was developed to differentiate localized regions (75x75 µm) of tumor from skeletal muscle and adipose tissue based on the MSER segmentation output. Specifically, a logistic regression model was used to classify each localized region. The logistic regression model yielded an output in terms of probability (0-100%) that tumor was located within each 75x75 µm region. The model performance was tested using a receiver operator characteristic (ROC) curve analysis that revealed 77% sensitivity and 81% specificity. For margin classification, the whole margin image was divided into localized regions and this tissue-type classification model was applied. In a subset of 6 margins (3 negative, 3 positive), it was shown that with a tumor probability threshold of 50%, 8% of all regions from negative margins exceeded this threshold, while over 17% of all regions exceeded the threshold in the positive margins. Thus, 8% of regions in negative margins were considered false positives. These false positive regions are likely due to the high density of APFs present in normal tissues, which clearly demonstrates a challenge in implementing this automatic algorithm based on AO staining alone.

Thus, the third aim was to improve the specificity of the diagnostic model through leveraging other sources of contrast. Modifications were made to the SIM system to enable fluorescence imaging at a variety of wavelengths. Specifically, the SIM system was modified to enabling imaging of red fluorescent protein (RFP) expressing sarcomas, which were used to delineate the location of tumor cells within each image. Initial analysis of AO stained panels confirmed that there was room for improvement in tumor detection, particularly in regards to false positive regions that were negative for RFP. One approach for improving the specificity of the diagnostic model was to investigate using a fluorophore that was more specific to staining tumor. Specifically, tetracycline was selected because it appeared to specifically stain freshly excised tumor tissue in a matter of minutes, and was non-toxic and stable in solution. Results indicated that tetracycline staining has promise for increasing the specificity of tumor detection in SIM images of a preclinical sarcoma model and further investigation is warranted.

In conclusion, this work presents the development of a combination of tools that is capable of automated segmentation and quantification of micro-anatomical images of thick tissue. When compared to the fluorescence microendoscope, wide-field multispectral fluorescence SIM imaging provided improved image contrast, a larger FOV with comparable resolution, and the ability to image a variety of fluorophores. MSER was an appropriate and rapid approach to segment dense collections of APFs from wide-field SIM images. Variables that reflect the morphology of the tissue, such as the density, size, and shape of nuclei and nucleoli, can be used to automatically diagnose SIM images. The clinical utility of SIM imaging and MSER segmentation to detect microscopic residual disease has been demonstrated by imaging excised preclinical sarcoma margins. Ultimately, this work demonstrates that fluorescence imaging of tissue micro-anatomy combined with a specialized algorithm for delineation and quantification of features is a means for rapid, non-destructive and automated detection of microscopic disease, which could improve cancer management in a variety of clinical scenarios.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

BACKGROUND AND PURPOSE Reproducible segmentation of brain tumors on magnetic resonance images is an important clinical need. This study was designed to evaluate the reliability of a novel fully automated segmentation tool for brain tumor image analysis in comparison to manually defined tumor segmentations. METHODS We prospectively evaluated preoperative MR Images from 25 glioblastoma patients. Two independent expert raters performed manual segmentations. Automatic segmentations were performed using the Brain Tumor Image Analysis software (BraTumIA). In order to study the different tumor compartments, the complete tumor volume TV (enhancing part plus non-enhancing part plus necrotic core of the tumor), the TV+ (TV plus edema) and the contrast enhancing tumor volume CETV were identified. We quantified the overlap between manual and automated segmentation by calculation of diameter measurements as well as the Dice coefficients, the positive predictive values, sensitivity, relative volume error and absolute volume error. RESULTS Comparison of automated versus manual extraction of 2-dimensional diameter measurements showed no significant difference (p = 0.29). Comparison of automated versus manual segmentation of volumetric segmentations showed significant differences for TV+ and TV (p<0.05) but no significant differences for CETV (p>0.05) with regard to the Dice overlap coefficients. Spearman's rank correlation coefficients (ρ) of TV+, TV and CETV showed highly significant correlations between automatic and manual segmentations. Tumor localization did not influence the accuracy of segmentation. CONCLUSIONS In summary, we demonstrated that BraTumIA supports radiologists and clinicians by providing accurate measures of cross-sectional diameter-based tumor extensions. The automated volume measurements were comparable to manual tumor delineation for CETV tumor volumes, and outperformed inter-rater variability for overlap and sensitivity.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Introduction Diffusion weighted Imaging (DWI) techniques are able to measure, in vivo and non-invasively, the diffusivity of water molecules inside the human brain. DWI has been applied on cerebral ischemia, brain maturation, epilepsy, multiple sclerosis, etc. [1]. Nowadays, there is a very high availability of these images. DWI allows the identification of brain tissues, so its accurate segmentation is a common initial step for the referred applications. Materials and Methods We present a validation study on automated segmentation of DWI based on the Gaussian mixture and hidden Markov random field models. This methodology is widely solved with iterative conditional modes algorithm, but some studies suggest [2] that graph-cuts (GC) algorithms improve the results when initialization is not close to the final solution. We implemented a segmentation tool integrating ITK with a GC algorithm [3], and a validation software using fuzzy overlap measures [4]. Results Segmentation accuracy of each tool is tested against a gold-standard segmentation obtained from a T1 MPRAGE magnetic resonance image of the same subject, registered to the DWI space. The proposed software shows meaningful improvements by using the GC energy minimization approach on DTI and DSI (Diffusion Spectrum Imaging) data. Conclusions The brain tissues segmentation on DWI is a fundamental step on many applications. Accuracy and robustness improvements are achieved with the proposed software, with high impact on the application’s final result.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper presents an automated segmentation approach for MR images of the knee bones. The bones are the first stage of a segmentation system for the knee, primarily aimed at the automated segmentation of the cartilages. The segmentation is performed using 3D active shape models (ASM), which are initialized using an affine registration to an atlas. The 3D ASMs of the bones are created automatically using a point distribution model optimization scheme. The accuracy and robustness of the segmentation approach was experimentally validated using an MR database of fat suppressed spoiled gradient recall images.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Automobiles have deeply impacted the way in which we travel but they have also contributed to many deaths and injury due to crashes. A number of reasons for these crashes have been pointed out by researchers. Inexperience has been identified as a contributing factor to road crashes. Driver’s driving abilities also play a vital role in judging the road environment and reacting in-time to avoid any possible collision. Therefore driver’s perceptual and motor skills remain the key factors impacting on road safety. Our failure to understand what is really important for learners, in terms of competent driving, is one of the many challenges for building better training programs. Driver training is one of the interventions aimed at decreasing the number of crashes that involve young drivers. Currently, there is a need to develop comprehensive driver evaluation system that benefits from the advances in Driver Assistance Systems. A multidisciplinary approach is necessary to explain how driving abilities evolves with on-road driving experience. To our knowledge, driver assistance systems have never been comprehensively used in a driver training context to assess the safety aspect of driving. The aim and novelty of this thesis is to develop and evaluate an Intelligent Driver Training System (IDTS) as an automated assessment tool that will help drivers and their trainers to comprehensively view complex driving manoeuvres and potentially provide effective feedback by post processing the data recorded during driving. This system is designed to help driver trainers to accurately evaluate driver performance and has the potential to provide valuable feedback to the drivers. Since driving is dependent on fuzzy inputs from the driver (i.e. approximate distance calculation from the other vehicles, approximate assumption of the other vehicle speed), it is necessary that the evaluation system is based on criteria and rules that handles uncertain and fuzzy characteristics of the driving tasks. Therefore, the proposed IDTS utilizes fuzzy set theory for the assessment of driver performance. The proposed research program focuses on integrating the multi-sensory information acquired from the vehicle, driver and environment to assess driving competencies. After information acquisition, the current research focuses on automated segmentation of the selected manoeuvres from the driving scenario. This leads to the creation of a model that determines a “competency” criterion through the driving performance protocol used by driver trainers (i.e. expert knowledge) to assess drivers. This is achieved by comprehensively evaluating and assessing the data stream acquired from multiple in-vehicle sensors using fuzzy rules and classifying the driving manoeuvres (i.e. overtake, lane change, T-crossing and turn) between low and high competency. The fuzzy rules use parameters such as following distance, gaze depth and scan area, distance with respect to lanes and excessive acceleration or braking during the manoeuvres to assess competency. These rules that identify driving competency were initially designed with the help of expert’s knowledge (i.e. driver trainers). In-order to fine tune these rules and the parameters that define these rules, a driving experiment was conducted to identify the empirical differences between novice and experienced drivers. The results from the driving experiment indicated that significant differences existed between novice and experienced driver, in terms of their gaze pattern and duration, speed, stop time at the T-crossing, lane keeping and the time spent in lanes while performing the selected manoeuvres. These differences were used to refine the fuzzy membership functions and rules that govern the assessments of the driving tasks. Next, this research focused on providing an integrated visual assessment interface to both driver trainers and their trainees. By providing a rich set of interactive graphical interfaces, displaying information about the driving tasks, Intelligent Driver Training System (IDTS) visualisation module has the potential to give empirical feedback to its users. Lastly, the validation of the IDTS system’s assessment was conducted by comparing IDTS objective assessments, for the driving experiment, with the subjective assessments of the driver trainers for particular manoeuvres. Results show that not only IDTS was able to match the subjective assessments made by driver trainers during the driving experiment but also identified some additional driving manoeuvres performed in low competency that were not identified by the driver trainers due to increased mental workload of trainers when assessing multiple variables that constitute driving. The validation of IDTS emphasized the need for an automated assessment tool that can segment the manoeuvres from the driving scenario, further investigate the variables within that manoeuvre to determine the manoeuvre’s competency and provide integrated visualisation regarding the manoeuvre to its users (i.e. trainers and trainees). Through analysis and validation it was shown that IDTS is a useful assistance tool for driver trainers to empirically assess and potentially provide feedback regarding the manoeuvres undertaken by the drivers.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Purpose To examine macular retinal thickness and retinal layer thickness with spectral domain optical coherence tomography (OCT) in a population of children with normal ocular health and minimal refractive errors. Methods High resolution macular OCT scans from 196 children aged from 4 to 12 years (mean age 8 ± 2 years) were analysed to determine total retinal thickness and the thickness of 6 different retinal layers across the central 5 mm of the posterior pole. Automated segmentation with manual correction was used to derive retinal thickness values. Results The mean total retinal thickness in the central 1 mm foveal zone was 255 ± 16 μm, and this increased significantly with age (mean increase of 1.8 microns per year) in childhood (p<0.001). Age-related increases in thickness of some retinal layers were also observed, with changes of highest statistical significance found in the outer retinal layers in the central foveal region (p<0.01). Significant topographical variations in thickness of each of the retinal layers were also observed (p<0.001). Conclusions Small magnitude, statistically significant increases in total retinal thickness and retinal layer thickness occur from early childhood to adolescence. The most prominent changes appear to occur in the outer retinal layers of the central fovea.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We have benchmarked the maximum obtainable recognition accuracy on five publicly available standard word image data sets using semi-automated segmentation and a commercial OCR. These images have been cropped from camera captured scene images, born digital images (BDI) and street view images. Using the Matlab based tool developed by us, we have annotated at the pixel level more than 3600 word images from the five data sets. The word images binarized by the tool, as well as by our own midline analysis and propagation of segmentation (MAPS) algorithm are recognized using the trial version of Nuance Omnipage OCR and these two results are compared with the best reported in the literature. The benchmark word recognition rates obtained on ICDAR 2003, Sign evaluation, Street view, Born-digital and ICDAR 2011 data sets are 83.9%, 89.3%, 79.6%, 88.5% and 86.7%, respectively. The results obtained from MAPS binarized word images without the use of any lexicon are 64.5% and 71.7% for ICDAR 2003 and 2011 respectively, and these values are higher than the best reported values in the literature of 61.1% and 41.2%, respectively. MAPS results of 82.8% for BDI 2011 dataset matches the performance of the state of the art method based on power law transform.