981 results for Unsupervised segmentation method
Abstract:
The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in individual users' subjective perceptions of multimedia content. The objective of this dissertation is to develop an integrated multimedia indexing and retrieval framework that bridges the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques has been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable semantic search for images and videos, and can be tailored to solve problems in other multimedia-related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is used to enable long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images; (2) in addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, which allows a user to search more effectively for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract object information from images. This segmentation algorithm is further used in video indexing and retrieval, where a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection.
Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their use in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model.
Abstract:
This dissertation introduces a new system for handwritten text recognition based on an improved neural network design. Most existing neural networks use the mean square error as the standard error function. The system proposed in this dissertation uses the mean quartic error function, whose third and fourth derivatives are non-zero. As a result, several improvements to the training methods were achieved. The training results are carefully assessed before and after the update. To evaluate the performance of a training system, three essential factors are considered, in decreasing order of importance: (1) the error rate on the testing set, (2) the processing time needed to recognize a segmented character, and (3) the total training time and, subsequently, the total testing time. It is observed that bounded training methods accelerate the training process, while semi-third order training methods, next-minimal training methods, and preprocessing operations reduce the error rate on the testing set. Empirical observations suggest that two combinations of training methods are needed, one for each character case. Since character segmentation is required for word and sentence recognition, this dissertation also provides an effective rule-based segmentation method, which differs from conventional adaptive segmentation methods. Dictionary-based correction is used to correct mistakes resulting from the recognition and segmentation phases. The integration of the segmentation methods with the handwritten character recognition algorithm yielded an accuracy of 92% for lower case characters and 97% for upper case characters. In the testing phase, the database consists of 20,000 handwritten characters, with 10,000 for each case. Recognizing 10,000 handwritten characters in the testing phase required 8.5 seconds of processing time.
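The remark about non-zero third and fourth derivatives can be checked with a small sketch. It assumes the mean quartic error is the average of the fourth power of the per-sample error, which the abstract does not spell out; the function names are illustrative and are not taken from the dissertation.

```python
def mean_square_error(outputs, targets):
    """Average of squared per-sample errors (the conventional choice)."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


def mean_quartic_error(outputs, targets):
    """Average of fourth-power per-sample errors (assumed definition)."""
    return sum((o - t) ** 4 for o, t in zip(outputs, targets)) / len(outputs)


def square_derivatives(e):
    """Derivatives of e**2 with respect to the error e, orders 1 to 4."""
    return (2 * e, 2.0, 0.0, 0.0)


def quartic_derivatives(e):
    """Derivatives of e**4 with respect to the error e, orders 1 to 4."""
    return (4 * e ** 3, 12 * e ** 2, 24 * e, 24.0)
```

For a per-sample error of 0.5, the squared error has vanishing third and fourth derivatives, while the quartic error still carries higher-order information that a training method could in principle exploit.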
Abstract:
Purpose: Custom cranio-orbital implants have been shown to achieve better performance than their hand-shaped counterparts by restoring skull anatomy more accurately and by reducing surgery time. Designing a custom implant involves reconstructing a model of the patient's skull using their computed tomography (CT) scan. The healthy side of the skull model, contralateral to the damaged region, can then be used to design an implant plan. Designing implants for areas of thin bone, such as the orbits, is challenging due to poor CT resolution of bone structures. This makes preoperative design time-intensive, since thin bone structures in CT data must be manually segmented. The objective of this thesis was to research methods to accurately and efficiently design cranio-orbital implant plans, with a focus on the orbits, and to develop software that integrates these methods. Methods: The software consists of modules that use image and surface restoration approaches to enhance both the quality of CT data and the reconstructed model. It enables users to input CT data and use tools to output a skull model with restored anatomy. The skull model can then be used to design the implant plan. The software was designed using 3D Slicer, an open-source medical visualization platform. It was tested on CT data from thirteen patients. Results: The average time it took to create a skull model with restored anatomy using our software was 0.33 hours ± 0.04 STD. In comparison, the manual segmentation method took between 3 and 6 hours. To assess the structural accuracy of the reconstructed models, CT data from the thirteen patients was used to compare the models created using our software with those created using the manual method. When registering the skull models together, the difference between each set of skulls was found to be 0.4 mm ± 0.16 STD. Conclusions: We have developed software to design custom cranio-orbital implant plans, with a focus on thin bone structures. 
The method described decreases design time, and is of similar accuracy to the manual method.
Abstract:
Above ground biomass is frequently estimated from forest inventory data, with an extrapolation method for per unit area evaluations. This procedure is labour demanding and costly. In this study, above ground biomass functions whose independent variable is crown horizontal projection were developed. A multi-resolution segmentation method and object-oriented classification, based on very high spatial resolution satellite images, were used to obtain the area of tree crown horizontal projection for umbrella pine (Pinus pinea L.). A set of inventory plots was measured, and above ground biomass per tree and per plot was calculated with existing allometric functions for this species. The two data sets were used to fit linear functions both for individual plots and for their cumulative values. The results show a good performance of the models. Errors smaller than 10% are obtained for stand areas greater than 1.4 ha. These functions have the advantages of estimating above ground biomass for the whole area under study or surveillance without requiring forest inventory, allowing monitoring over short time periods, and being easily implemented in a geographical information system environment.
Abstract:
Forest biomass has gained increasing importance in the world economy and in the evaluation and monitoring of forest development. It was identified as a global strategic reserve, due to its applications in bioenergy, bioproduct development and issues related to reducing greenhouse gas emissions. The estimation of above ground biomass is frequently done with allometric functions per species and plot inventory data. An adequate sampling design and intensity for a given error threshold is required. The estimation per unit area is done using an extrapolation method. This procedure is labour demanding and costly. The main goal of this study is the development of allometric functions for the estimation of above ground biomass with ground cover as the independent variable, for forest areas of holm oak (Quercus rotundifolia), cork oak (Quercus suber) and umbrella pine (Pinus pinea) in multiple use systems. Ground cover per species was derived from crown horizontal projection obtained by processing high resolution satellite images, orthorectified and geometrically and atmospherically corrected, with a multi-resolution segmentation method and object-oriented classification. Forest inventory data were used to estimate plot above ground biomass with published allometric functions at tree level. The developed functions were fitted for monospecies stands and for multispecies stands of Quercus rotundifolia and Quercus suber, and of Quercus suber and Pinus pinea. Stand composition was accounted for by adding dummy variables to distinguish monospecies from multispecies stands. The models showed a good performance. Notably, the dummy variables, which reflect the differences between species, improved the models. Significant differences were found in above ground biomass estimation between the functions with and without the dummy variables. An error threshold of 10% corresponds to stand areas of about 40 ha. 
This method enables the overall area evaluation, not requiring extrapolation procedures, for the three species, which occur frequently in multispecies stands.
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
An unsupervised approach to image segmentation which fuses region and boundary information is presented. The proposed approach takes advantage of the combined use of three different strategies: the guidance of seed placement, the control of the decision criterion, and boundary refinement. The new algorithm uses the boundary information to initialize a set of active regions which compete for the pixels in order to segment the whole image. The method is implemented on a multiresolution representation which ensures noise robustness as well as computational efficiency. The accuracy of the segmentation results has been demonstrated through an objective comparative evaluation of the method.
Abstract:
Research on image processing has shown that combining segmentation methods may lead to a solid approach for extracting semantic information from different sorts of images. Within this context, the Normalized Cut (NCut) is usually used as a final partitioning tool for graphs modeled by some chosen method. This work explores the Watershed Transform as a modeling tool, using different criteria of the hierarchical Watershed to convert an image into an adjacency graph. The Watershed is combined with an unsupervised distance learning step that redistributes the graph weights and redefines the similarity matrix before the final segmentation step using NCut. Using the Berkeley Segmentation Data Set and Benchmark as a reference, our goal is to compare the results obtained with this method against previous work to validate its performance.
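The NCut criterion mentioned above scores a bipartition (A, B) of a weighted graph as cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V). The sketch below evaluates that score on a toy similarity matrix and finds the best bipartition by exhaustive search. This is an illustration only: practical NCut implementations use a spectral relaxation instead of enumeration, and the matrix values and function names here are hypothetical rather than taken from the work.

```python
from itertools import combinations


def ncut_value(weights, part_a):
    """Normalized cut score of the bipartition (part_a, remaining nodes)."""
    nodes = range(len(weights))
    part_b = [n for n in nodes if n not in part_a]
    cut = sum(weights[i][j] for i in part_a for j in part_b)
    assoc_a = sum(weights[i][j] for i in part_a for j in nodes)
    assoc_b = sum(weights[i][j] for i in part_b for j in nodes)
    return cut / assoc_a + cut / assoc_b


def best_bipartition(weights):
    """Exhaustive search; node 0 is always kept in part A to avoid mirrored duplicates."""
    n = len(weights)
    best = None
    for size in range(0, n - 1):  # part A gets 1 to n-1 nodes, so part B is never empty
        for extra in combinations(range(1, n), size):
            part_a = frozenset((0,) + extra)
            score = ncut_value(weights, part_a)
            if best is None or score < best[1]:
                best = (part_a, score)
    return best


# Hypothetical similarity matrix: nodes 0-1 and nodes 2-3 form two tight groups.
TOY_WEIGHTS = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
```

Calling `best_bipartition(TOY_WEIGHTS)` separates the two tightly connected pairs, since cutting only the weak 0.1 links gives the smallest normalized score.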
Abstract:
Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Especially in image data applications, self-organizing methods for unsupervised classification have been successfully applied to clustering pixels or groups of pixels in order to perform segmentation tasks. The first important contribution of this paper is the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications to the Independent Component Analysis Mixture Model (ICAMM). These improvements were proposed by considering some of the model's limitations and by analyzing how it should be improved in order to become more efficient. Moreover, a pre-processing methodology was also proposed, which is based on combining Sparse Code Shrinkage (SCS) for image denoising with the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied to segmenting images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.
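As a rough illustration of the Sobel step in the pre-processing stage, the sketch below computes the gradient magnitude of a small grayscale image stored as nested lists. It covers only the standard 3x3 Sobel kernels. How the paper combines the edge map with Sparse Code Shrinkage is not described in the abstract and is not modeled here, and the function name is illustrative.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Gradient magnitude per pixel; border pixels are left at 0.0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    value = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * value
                    gy += SOBEL_Y[dr][dc] * value
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out
```

On an image with a vertical step from 0 to 1, the interior pixels next to the step get a non-zero response, while a constant image gives zero everywhere.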
Abstract:
Lateral ventricular volumes based on segmented brain MR images can be significantly underestimated if partial volume effects are not considered. This is because a group of voxels in the neighborhood of the lateral ventricles is often misclassified as gray matter due to partial volume effects. This group of voxels is actually a mixture of ventricular cerebrospinal fluid and white matter, and therefore a portion of it should be included as part of the lateral ventricular structure. In this note, we describe an automated method for the measurement of lateral ventricular volumes on segmented brain MR images. Image segmentation was carried out using a combination of intensity correction and thresholding. The method features a procedure for addressing misclassified voxels surrounding the lateral ventricles. A detailed analysis showed that lateral ventricular volumes could be underestimated by 10 to 30%, depending on the size of the lateral ventricular structure, if misclassified voxels were not included. The method was validated through comparison with the averaged manually traced volumes. Finally, the merit of the method is demonstrated in the evaluation of the rate of lateral ventricular enlargement. (C) 2001 Elsevier Science Inc. All rights reserved.
Abstract:
We present a method for segmenting white matter tracts from high angular resolution diffusion MR images by representing the data in a 5-dimensional space of position and orientation. Whereas crossing fiber tracts cannot be separated in 3D position space, they clearly disentangle in 5D position-orientation space. The segmentation is done using a 5D level set method applied to hyper-surfaces evolving in 5D position-orientation space. In this paper we present a methodology for constructing the position-orientation space. We then show how to implement the standard level set method in such a non-Euclidean high-dimensional space. Level set theory is defined for N dimensions in principle, but there are several practical implementation details to consider, such as mean curvature. Finally, we show results from a synthetic model and a few preliminary results on real data of a human brain acquired by high angular resolution diffusion MRI.
Abstract:
In image processing, segmentation algorithms constitute one of the main focuses of research. In this paper, new image segmentation algorithms based on a hard version of the information bottleneck method are presented. The objective of this method is to extract a compact representation of a variable, considered the input, with minimal loss of mutual information with respect to another variable, considered the output. First, we introduce a split-and-merge algorithm based on the definition of an information channel between a set of regions (input) of the image and the intensity histogram bins (output). From this channel, the maximization of the mutual information gain is used to optimize the image partitioning. Then, the merging process of the regions obtained in the previous phase is carried out by minimizing the loss of mutual information. From the inversion of the above channel, we also present a new histogram clustering algorithm based on the minimization of the mutual information loss, where now the input variable represents the histogram bins and the output is given by the set of regions obtained from the above split-and-merge algorithm. Finally, we introduce two new clustering algorithms which show how the information bottleneck method can be applied to the registration channel obtained when two multimodal images are correctly aligned. Different experiments on 2-D and 3-D images show the behavior of the proposed algorithms.
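The quantity driving the split-and-merge steps is the mutual information of the channel between regions and histogram bins. The sketch below computes it for a flat list of pixels, assuming that the probability of a region is its share of the pixels and that the conditional distribution of bins is the normalized histogram of that region. This is a common construction, but the abstract does not give the exact definitions, and the function name is illustrative.

```python
from collections import Counter
from math import log2


def mutual_information(bins, labels):
    """I(R;B) in bits, for per-pixel histogram bin indices and region labels."""
    n = len(bins)
    joint = Counter(zip(labels, bins))
    region_counts = Counter(labels)
    bin_counts = Counter(bins)
    total = 0.0
    for (region, bin_index), count in joint.items():
        # p(r,b) / (p(r) p(b)) simplifies to count * n / (region count * bin count).
        ratio = count * n / (region_counts[region] * bin_counts[bin_index])
        total += (count / n) * log2(ratio)
    return total
```

With this measure, a partition that separates the two intensity levels of a two-tone image captures one full bit, while a single region or an uninformative split captures none. That difference is the gain a splitting step would try to maximize.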
Abstract:
The large and growing number of digital images is making manual image search laborious. Only a fraction of the images contain metadata that can be used to search for a particular type of image. Thus, the main research question of this thesis is whether it is possible to learn visual object categories directly from images. Computers process images as long lists of pixels that do not have a clear connection to the high-level semantics which could be used in image search. Various methods have been introduced in the literature to extract low-level image features, as well as approaches to connect these low-level features with high-level semantics. One of these approaches, called Bag-of-Features, is studied in this thesis. In the Bag-of-Features approach, the images are described using a visual codebook. The codebook is built from the descriptions of the image patches using clustering. The images are described by matching descriptions of image patches with the visual codebook and counting the number of matches for each code. In this thesis, unsupervised visual object categorisation using the Bag-of-Features approach is studied. The goal is to find groups of similar images, e.g., images that contain an object from the same category. The standard Bag-of-Features approach is improved by using spatial information and visual saliency. It was found that the performance of visual object categorisation can be improved by using spatial information of local features to verify the matches. However, this process is computationally heavy, and thus the number of images must be limited in the spatial matching, for example by using the Bag-of-Features method as in this study. Different approaches for saliency detection are studied and a new method based on the Hessian-Affine local feature detector is proposed. The new method achieves results comparable with the current state of the art. 
The visual object categorisation performance was improved by using foreground segmentation based on saliency information, especially when the background could be considered as clutter.
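The matching and counting step of the Bag-of-Features description can be sketched as follows. The example assumes that patch descriptors are plain numeric tuples and that "matching" means nearest codebook entry under squared Euclidean distance. The codebook here is given by hand, whereas the thesis builds it by clustering, and the function names are illustrative.

```python
def nearest_code(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    def squared_distance(index):
        return sum((d - c) ** 2 for d, c in zip(descriptor, codebook[index]))
    return min(range(len(codebook)), key=squared_distance)


def bag_of_features(descriptors, codebook):
    """Histogram counting how many patch descriptors match each code."""
    histogram = [0] * len(codebook)
    for descriptor in descriptors:
        histogram[nearest_code(descriptor, codebook)] += 1
    return histogram
```

Two images can then be compared through their histograms, for example by normalizing them and measuring a distance between the resulting vectors.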