29 resultados para medical image segmentation
Resumo:
We describe a novel method for human activity segmentation and interpretation in surveillance applications based on Gabor filter-bank features. A complex human activity is modeled as a sequence of elementary human actions like walking, running, jogging, boxing, hand-waving etc. Since human silhouette can be modeled by a set of rectangles, the elementary human actions can be modeled as a sequence of a set of rectangles with different orientations and scales. The activity segmentation is based on Gabor filter-bank features and normalized spectral clustering. The feature trajectories of an action category are learnt from training example videos using dynamic time warping. The combined segmentation and the recognition processes are very efficient as both the algorithms share the same framework and Gabor features computed for the former can be used for the later. We have also proposed a simple shadow detection technique to extract good silhouette which is necessary for good accuracy of an action recognition technique.
Resumo:
In this paper the approach for automatic road extraction for an urban region using structural, spectral and geometric characteristics of roads has been presented. Roads have been extracted based on two levels: Pre-processing and road extraction methods. Initially, the image is pre-processed to improve the tolerance by reducing the clutter (that mostly represents the buildings, parking lots, vegetation regions and other open spaces). The road segments are then extracted using Texture Progressive Analysis (TPA) and Normalized cut algorithm. The TPA technique uses binary segmentation based on three levels of texture statistical evaluation to extract road segments where as, Normalizedcut method for road extraction is a graph based method that generates optimal partition of road segments. The performance evaluation (quality measures) for road extraction using TPA and normalized cut method is compared. Thus the experimental result show that normalized cut method is efficient in extracting road segments in urban region from high resolution satellite image.
Resumo:
Denoising of medical images in wavelet domain has potential application in transmission technologies such as teleradiology. This technique becomes all the more attractive when we consider the progressive transmission in a teleradiology system. The transmitted images are corrupted mainly due to noisy channels. In this paper, we present a new real time image denoising scheme based on limited restoration of bit-planes of wavelet coefficients. The proposed scheme exploits the fundamental property of wavelet transform - its ability to analyze the image at different resolution levels and the edge information associated with each sub-band. The desired bit-rate control is achieved by applying the restoration on a limited number of bit-planes subject to the optimal smoothing. The proposed method adapts itself to the preference of the medical expert; a single parameter can be used to balance the preservation of (expert-dependent) relevant details against the degree of noise reduction. The proposed scheme relies on the fact that noise commonly manifests itself as a fine-grained structure in image and wavelet transform allows the restoration strategy to adapt itself according to directional features of edges. The proposed approach shows promising results when compared with unrestored case, in context of error reduction. It also has capability to adapt to situations where noise level in the image varies and with the changing requirements of medical-experts. The applicability of the proposed approach has implications in restoration of medical images in teleradiology systems. The proposed scheme is computationally efficient.
Resumo:
Purpose: Fast reconstruction of interior optical parameter distribution using a new approach called Broyden-based model iterative image reconstruction (BMOBIIR) and adjoint Broyden-based MOBIIR (ABMOBIIR) of a tissue and a tissue mimicking phantom from boundary measurement data in diffuse optical tomography (DOT). Methods: DOT is a nonlinear and ill-posed inverse problem. Newton-based MOBIIR algorithm, which is generally used, requires repeated evaluation of the Jacobian which consumes bulk of the computation time for reconstruction. In this study, we propose a Broyden approach-based accelerated scheme for Jacobian computation and it is combined with conjugate gradient scheme (CGS) for fast reconstruction. The method makes explicit use of secant and adjoint information that can be obtained from forward solution of the diffusion equation. This approach reduces the computational time many fold by approximating the system Jacobian successively through low-rank updates. Results: Simulation studies have been carried out with single as well as multiple inhomogeneities. Algorithms are validated using an experimental study carried out on a pork tissue with fat acting as an inhomogeneity. The results obtained through the proposed BMOBIIR and ABMOBIIR approaches are compared with those of Newton-based MOBIIR algorithm. The mean squared error and execution time are used as metrics for comparing the results of reconstruction. Conclusions: We have shown through experimental and simulation studies that Broyden-based MOBIIR and adjoint Broyden-based methods are capable of reconstructing single as well as multiple inhomogeneities in tissue and a tissue-mimicking phantom. Broyden MOBIIR and adjoint Broyden MOBIIR methods are computationally simple and they result in much faster implementations because they avoid direct evaluation of Jacobian. The image reconstructions have been carried out with different initial values using Newton, Broyden, and adjoint Broyden approaches. These algorithms work well when the initial guess is close to the true solution. However, when initial guess is far away from true solution, Newton-based MOBIIR gives better reconstructed images. The proposed methods are found to be stable with noisy measurement data. (C) 2011 American Association of Physicists in Medicine. DOI: 10.1118/1.3531572]
Resumo:
Diffuse optical tomography (DOT) is one of the ways to probe highly scattering media such as tissue using low-energy near infra-red light (NIR) to reconstruct a map of the optical property distribution. The interaction of the photons in biological tissue is a non-linear process and the phton transport through the tissue is modelled using diffusion theory. The inversion problem is often solved through iterative methods based on nonlinear optimization for the minimization of a data-model misfit function. The solution of the non-linear problem can be improved by modeling and optimizing the cost functional. The cost functional is f(x) = x(T)Ax - b(T)x + c and after minimization, the cost functional reduces to Ax = b. The spatial distribution of optical parameter can be obtained by solving the above equation iteratively for x. As the problem is non-linear, ill-posed and ill-conditioned, there will be an error or correction term for x at each iteration. A linearization strategy is proposed for the solution of the nonlinear ill-posed inverse problem by linear combination of system matrix and error in solution. By propagating the error (e) information (obtained from previous iteration) to the minimization function f(x), we can rewrite the minimization function as f(x; e) = (x + e)(T) A(x + e) - b(T)(x + e) + c. The revised cost functional is f(x; e) = f(x) + e(T)Ae. The self guided spatial weighted prior (e(T)Ae) error (e, error in estimating x) information along the principal nodes facilitates a well resolved dominant solution over the region of interest. The local minimization reduces the spreading of inclusion and removes the side lobes, thereby improving the contrast, localization and resolution of reconstructed image which has not been possible with conventional linear and regularization algorithm.
Resumo:
Scenic word images undergo degradations due to motion blur, uneven illumination, shadows and defocussing, which lead to difficulty in segmentation. As a result, the recognition results reported on the scenic word image datasets of ICDAR have been low. We introduce a novel technique, where we choose the middle row of the image as a sub-image and segment it first. Then, the labels from this segmented sub-image are used to propagate labels to other pixels in the image. This approach, which is unique and distinct from the existing methods, results in improved segmentation. Bayesian classification and Max-flow methods have been independently used for label propagation. This midline based approach limits the impact of degradations that happens to the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on ICDAR 2003 and ICDAR 2011 datasets. Our word recognition results of 64.5% and 71.6% are better than those of methods in the literature and also methods that competed in the Robust reading competition. Our method makes an implicit assumption that degradation is not present in the middle row.
Resumo:
We have benchmarked the maximum obtainable recognition accuracy on five publicly available standard word image data sets using semi-automated segmentation and a commercial OCR. These images have been cropped from camera captured scene images, born digital images (BDI) and street view images. Using the Matlab based tool developed by us, we have annotated at the pixel level more than 3600 word images from the five data sets. The word images binarized by the tool, as well as by our own midline analysis and propagation of segmentation (MAPS) algorithm are recognized using the trial version of Nuance Omnipage OCR and these two results are compared with the best reported in the literature. The benchmark word recognition rates obtained on ICDAR 2003, Sign evaluation, Street view, Born-digital and ICDAR 2011 data sets are 83.9%, 89.3%, 79.6%, 88.5% and 86.7%, respectively. The results obtained from MAPS binarized word images without the use of any lexicon are 64.5% and 71.7% for ICDAR 2003 and 2011 respectively, and these values are higher than the best reported values in the literature of 61.1% and 41.2%, respectively. MAPS results of 82.8% for BDI 2011 dataset matches the performance of the state of the art method based on power law transform.
Resumo:
In order to reduce the motion artifacts in DSA, non-rigid image registration is commonly used before subtracting the mask from the contrast image. Since DSA registration requires a set of spatially non-uniform control points, a conventional MRF model is not very efficient. In this paper, we introduce the concept of pivotal and non-pivotal control points to address this, and propose a non-uniform MRF for DSA registration. We use quad-trees in a novel way to generate the non-uniform grid of control points. Our MRF formulation produces a smooth displacement field and therefore results in better artifact reduction than that of registering the control points independently. We achieve improved computational performance using pivotal control points without compromising on the artifact reduction. We have tested our approach using several clinical data sets, and have presented the results of quantitative analysis, clinical assessment and performance improvement on a GPU. (C) 2013 Elsevier Ltd. All rights reserved.
Resumo:
In this paper, we report a breakthrough result on the difficult task of segmentation and recognition of coloured text from the word image dataset of ICDAR robust reading competition challenge 2: reading text in scene images. We split the word image into individual colour, gray and lightness planes and enhance the contrast of each of these planes independently by a power-law transform. The discrimination factor of each plane is computed as the maximum between-class variance used in Otsu thresholding. The plane that has maximum discrimination factor is selected for segmentation. The trial version of Omnipage OCR is then used on the binarized words for recognition. Our recognition results on ICDAR 2011 and ICDAR 2003 word datasets are compared with those reported in the literature. As baseline, the images binarized by simple global and local thresholding techniques were also recognized. The word recognition rate obtained by our non-linear enhancement and selection of plance method is 72.8% and 66.2% for ICDAR 2011 and 2003 word datasets, respectively. We have created ground-truth for each image at the pixel level to benchmark these datasets using a toolkit developed by us. The recognition rate of benchmarked images is 86.7% and 83.9% for ICDAR 2011 and 2003 datasets, respectively.
Resumo:
Electrical Impedance Tomography (EIT) is a computerized medical imaging technique which reconstructs the electrical impedance images of a domain under test from the boundary voltage-current data measured by an EIT electronic instrumentation using an image reconstruction algorithm. Being a computed tomography technique, EIT injects a constant current to the patient's body through the surface electrodes surrounding the domain to be imaged (Omega) and tries to calculate the spatial distribution of electrical conductivity or resistivity of the closed conducting domain using the potentials developed at the domain boundary (partial derivative Omega). Practical phantoms are essentially required to study, test and calibrate a medical EIT system for certifying the system before applying it on patients for diagnostic imaging. Therefore, the EIT phantoms are essentially required to generate boundary data for studying and assessing the instrumentation and inverse solvers a in EIT. For proper assessment of an inverse solver of a 2D EIT system, a perfect 2D practical phantom is required. As the practical phantoms are the assemblies of the objects with 3D geometries, the developing of a practical 2D-phantom is a great challenge and therefore, the boundary data generated from the practical phantoms with 3D geometry are found inappropriate for assessing a 2D inverse solver. Furthermore, the boundary data errors contributed by the instrumentation are also difficult to separate from the errors developed by the 3D phantoms. Hence, the errorless boundary data are found essential to assess the inverse solver in 2D EIT. In this direction, a MatLAB-based Virtual Phantom for 2D EIT (MatVP2DEIT) is developed to generate accurate boundary data for assessing the 2D-EIT inverse solvers and the image reconstruction accuracy. MatVP2DEIT is a MatLAB-based computer program which simulates a phantom in computer and generates the boundary potential data as the outputs by using the combinations of different phantom parameters as the inputs to the program. Phantom diameter, inhomogeneity geometry (shape, size and position), number of inhomogeneities, applied current magnitude, background resistivity, inhomogeneity resistivity all are set as the phantom variables which are provided as the input parameters to the MatVP2DEIT for simulating different phantom configurations. A constant current injection is simulated at the phantom boundary with different current injection protocols and boundary potential data are calculated. Boundary data sets are generated with different phantom configurations obtained with the different combinations of the phantom variables and the resistivity images are reconstructed using EIDORS. Boundary data of the virtual phantoms, containing inhomogeneities with complex geometries, are also generated for different current injection patterns using MatVP2DEIT and the resistivity imaging is studied. The effect of regularization method on the image reconstruction is also studied with the data generated by MatVP2DEIT. Resistivity images are evaluated by studying the resistivity parameters and contrast parameters estimated from the elemental resistivity profiles of the reconstructed phantom domain. Results show that the MatVP2DEIT generates accurate boundary data for different types of single or multiple objects which are efficient and accurate enough to reconstruct the resistivity images in EIDORS. The spatial resolution studies show that, the resistivity imaging conducted with the boundary data generated by MatVP2DEIT with 2048 elements, can reconstruct two circular inhomogeneities placed with a minimum distance (boundary to boundary) of 2 mm. It is also observed that, in MatVP2DEIT with 2048 elements, the boundary data generated for a phantom with a circular inhomogeneity of a diameter less than 7% of that of the phantom domain can produce resistivity images in EIDORS with a 1968 element mesh. Results also show that the MatVP2DEIT accurately generates the boundary data for neighbouring, opposite reference and trigonometric current patterns which are very suitable for resistivity reconstruction studies. MatVP2DEIT generated data are also found suitable for studying the effect of the different regularization methods on reconstruction process. Comparing the reconstructed image with an original geometry made in MatVP2DEIT, it would be easier to study the resistivity imaging procedures as well as the inverse solver performance. Using the proposed MatVP2DEIT software with modified domains, the cross sectional anatomy of a number of body parts can be simulated in PC and the impedance image reconstruction of human anatomy can be studied.
Resumo:
In this paper, we propose a technique for video object segmentation using patch seams across frames. Typically, seams, which are connected paths of low energy, are utilised for retargeting, where the primary aim is to reduce the image size while preserving the salient image contents. Here, we adapt the formulation of seams for temporal label propagation. The energy function associated with the proposed video seams provides temporal linking of patches across frames, to accurately segment the object. The proposed energy function takes into account the similarity of patches along the seam, temporal consistency of motion and spatial coherency of seams. Label propagation is achieved with high fidelity in the critical boundary regions, utilising the proposed patch seams. To achieve this without additional overheads, we curtail the error propagation by formulating boundary regions as rough-sets. The proposed approach out-perform state-of-the-art supervised and unsupervised algorithms, on benchmark datasets.
Resumo:
In optical character recognition of very old books, the recognition accuracy drops mainly due to the merging or breaking of characters. In this paper, we propose the first algorithm to segment merged Kannada characters by using a hypothesis to select the positions to be cut. This method searches for the best possible positions to segment, by taking into account the support vector machine classifier's recognition score and the validity of the aspect ratio (width to height ratio) of the segments between every pair of cut positions. The hypothesis to select the cut position is based on the fact that a concave surface exists above and below the touching portion. These concave surfaces are noted down by tracing the valleys in the top contour of the image and similarly doing it for the image rotated upside-down. The cut positions are then derived as closely matching valleys of the original and the rotated images. Our proposed segmentation algorithm works well for different font styles, shapes and sizes better than the existing vertical projection profile based segmentation. The proposed algorithm has been tested on 1125 different word images, each containing multiple merged characters, from an old Kannada book and 89.6% correct segmentation is achieved and the character recognition accuracy of merged words is 91.2%. A few points of merge are still missed due to the absence of a matched valley due to the specific shapes of the particular characters meeting at the merges.
Resumo:
Purpose: A prior image based temporally constrained reconstruction ( PITCR) algorithm was developed for obtaining accurate temperature maps having better volume coverage, and spatial, and temporal resolution than other algorithms for highly undersampled data in magnetic resonance (MR) thermometry. Methods: The proposed PITCR approach is an algorithm that gives weight to the prior image and performs accurate reconstruction in a dynamic imaging environment. The PITCR method is compared with the temporally constrained reconstruction (TCR) algorithm using pork muscle data. Results: The PITCR method provides superior performance compared to the TCR approach with highly undersampled data. The proposed approach is computationally expensive compared to the TCR approach, but this could be overcome by the advantage of reconstructing with fewer measurements. In the case of reconstruction of temperature maps from 16% of fully sampled data, the PITCR approach was 1.57x slower compared to the TCR approach, while the root mean square error using PITCR is 0.784 compared to 2.815 with the TCR scheme. Conclusions: The PITCR approach is able to perform more accurate reconstructions of temperature maps compared to the TCR approach with highly undersampled data in MR guided high intensity focused ultrasound. (C) 2015 American Association of Physicists in Medicine.
Resumo:
Crowd flow segmentation is an important step in many video surveillance tasks. In this work, we propose an algorithm for segmenting flows in H.264 compressed videos in a completely unsupervised manner. Our algorithm works on motion vectors which can be obtained by partially decoding the compressed video without extracting any additional features. Our approach is based on modelling the motion vector field as a Conditional Random Field (CRF) and obtaining oriented motion segments by finding the optimal labelling which minimises the global energy of CRF. These oriented motion segments are recursively merged based on gradient across their boundaries to obtain the final flow segments. This work in compressed domain can be easily extended to pixel domain by substituting motion vectors with motion based features like optical flow. The proposed algorithm is experimentally evaluated on a standard crowd flow dataset and its superior performance in both accuracy and computational time are demonstrated through quantitative results.