32 results for vision-based place recognition
in BORIS: Bern Open Repository and Information System - Bern - Switzerland
Abstract:
This book will serve as a foundation for a variety of useful applications of graph theory to computer vision, pattern recognition, and related areas. It covers a representative set of novel graph-theoretic methods for complex computer vision and pattern recognition tasks. The first part of the book presents the application of graph theory to low-level processing of digital images, such as a new method for partitioning a given image into a hierarchy of homogeneous areas using graph pyramids, and a study of the relationship between graph theory and digital topology. Part II presents graph-theoretic learning algorithms for high-level computer vision and pattern recognition applications, including a survey of graph-based methodologies for pattern recognition and computer vision, a presentation of a series of computationally efficient algorithms for testing graph isomorphism and related graph matching tasks in pattern recognition, and a new graph distance measure to be used for solving graph matching problems. Finally, Part III provides detailed descriptions of several applications of graph-based methods to real-world pattern recognition tasks. It includes a critical review of the main graph-based and structural methods for fingerprint classification, a new method to visualize time series of graphs, and potential applications in computer network monitoring and abnormal event detection.
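The abstract does not tie Part II to any particular implementation; as a rough illustration of the kind of graph matching tasks it surveys, the following sketch uses the networkx library (our choice, not something the book references) to test two small graphs for isomorphism and to compute a graph edit distance between structurally different graphs.

# A minimal sketch of graph isomorphism testing and a graph distance
# measure; the example graphs and the networkx library are our choices.
import networkx as nx

# Two small graphs that might encode, e.g., image region adjacency.
g1 = nx.Graph([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])
g2 = nx.Graph([(1, 2), (2, 3), (3, 1), (1, 4)])

# Exact isomorphism test: same structure despite different node labels.
print(nx.is_isomorphic(g1, g2))          # True

# Graph edit distance as an inexact graph matching (distance) measure.
g3 = nx.Graph([(1, 2), (2, 3), (3, 4)])  # a path, structurally different
print(nx.graph_edit_distance(g1, g3))    # > 0: edits needed to align them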
Abstract:
Computer vision-based food recognition could be used to estimate a meal's carbohydrate content for diabetic patients. This study proposes a methodology for automatic food recognition based on the Bag of Features (BoF) model. An extensive technical investigation was conducted to identify and optimize the best-performing components of the BoF architecture and to estimate the corresponding parameters. For the design and evaluation of the prototype system, a visual dataset of nearly 5,000 food images was created and organized into 11 classes. The optimized system computes dense local features using the scale-invariant feature transform (SIFT) on the HSV color space, builds a visual dictionary of 10,000 visual words using hierarchical k-means clustering, and finally classifies the food images with a linear support vector machine classifier. The system achieved a classification accuracy of the order of 78%, proving the feasibility of the proposed approach on a very challenging image dataset.
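A minimal sketch of the BoF pipeline described above, assuming OpenCV and scikit-learn; the dense keypoint step size, the reduced flat vocabulary size, and the train_images/labels variables are illustrative placeholders, and MiniBatchKMeans stands in for the paper's hierarchical k-means.

# Sketch of a Bag-of-Features food classifier: dense SIFT on HSV,
# a k-means visual dictionary, and a linear SVM. Illustrative only.
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import LinearSVC

sift = cv2.SIFT_create()

def dense_hsv_sift(image_bgr, step=8):
    """Dense SIFT descriptors computed on each HSV channel."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    kps = [cv2.KeyPoint(float(x), float(y), float(step))
           for y in range(step, hsv.shape[0] - step, step)
           for x in range(step, hsv.shape[1] - step, step)]
    per_channel = [sift.compute(hsv[:, :, c], kps)[1] for c in range(3)]
    return np.hstack(per_channel)  # one 3x128-dim descriptor per keypoint

def bof_histogram(descriptors, kmeans):
    """Normalized histogram of visual-word assignments (the BoF vector)."""
    words = kmeans.predict(descriptors)
    hist, _ = np.histogram(words, bins=np.arange(kmeans.n_clusters + 1))
    return hist / max(hist.sum(), 1)

# train_images and labels are assumed to exist; the paper's dictionary
# has 10,000 words, reduced here to keep the sketch tractable.
all_desc = np.vstack([dense_hsv_sift(img) for img in train_images])
kmeans = MiniBatchKMeans(n_clusters=1000, random_state=0).fit(all_desc)
X = np.array([bof_histogram(dense_hsv_sift(img), kmeans) for img in train_images])
clf = LinearSVC().fit(X, labels)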
Abstract:
Background: Individuals with type 1 diabetes (T1D) have to count the carbohydrates (CHOs) in their meals to estimate the prandial insulin dose needed to compensate for the meal's effect on blood glucose levels. CHO counting is very challenging but also crucial, since an error of 20 grams can substantially impair postprandial control. Method: The GoCARB system is a smartphone application designed to support T1D patients with CHO counting of nonpacked foods. In a typical scenario, the user places a reference card next to the dish and acquires 2 images with his/her smartphone. From these images, the plate is detected and the different food items on the plate are automatically segmented and recognized, while their 3D shape is reconstructed. Finally, the food volumes are calculated and the CHO content is estimated by combining the previous results and using the USDA nutritional database. Results: To evaluate the proposed system, a set of 24 multi-food dishes was used. For each dish, 3 pairs of images were taken, and for each pair the system was applied 4 times. The mean absolute percentage error in CHO estimation was 10 ± 12%, which corresponds to a mean absolute error of 6 ± 8 grams of CHO for normal-sized dishes. Conclusion: The laboratory experiments demonstrated the feasibility of the GoCARB prototype system, since the error was below the initial goal of 20 grams. However, further improvements and evaluation are needed prior to launching a system able to accommodate inter- and intracultural eating habits.
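The abstract does not spell out the final computation; a minimal sketch of the last step (volume to CHO grams) is shown below, with food densities and carbohydrate contents as hypothetical placeholder values rather than actual USDA entries.

# Sketch: convert estimated food volumes into CHO grams.
# Densities and CHO-per-100g values below are placeholders, not USDA data.
def estimate_cho(volume_ml, density_g_per_ml, cho_per_100g):
    """CHO grams = volume x density x carbohydrate fraction."""
    weight_g = volume_ml * density_g_per_ml
    return weight_g * cho_per_100g / 100.0

# Hypothetical segmented items from one dish:
# (name, volume in ml, density in g/ml, CHO in g per 100 g of food)
dish = [("rice",     180.0, 0.80, 28.0),
        ("chicken",  120.0, 1.00,  0.0),
        ("potatoes", 150.0, 0.65, 17.0)]

total_cho = sum(estimate_cho(v, d, c) for _, v, d, c in dish)
print(f"Estimated meal CHO: {total_cho:.1f} g")  # ~56.9 g for these values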
Abstract:
In retinal surgery, surgeons face difficulties such as indirect visualization of surgical targets, physiological tremor, and lack of tactile feedback, which increase the risk of retinal damage caused by incorrect surgical gestures. In this context, intraocular proximity sensing has the potential to overcome current technical limitations and increase surgical safety. In this paper, we present a system for detecting unintentional collisions between surgical tools and the retina using the visual feedback provided by the ophthalmic stereo microscope. Using stereo images, proximity between surgical tools and the retinal surface can be detected when their relative stereo disparity is small. For this purpose, we developed a system comprising two modules. The first tracks the surgical tool position in both stereo images. The second estimates a stereo disparity map of the retinal surface. Both modules were specially tailored to cope with the challenging visualization conditions in retinal surgery. The potential clinical value of the proposed method is demonstrated by extensive testing using a silicone phantom eye and recorded in vivo rabbit data.
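As a rough sketch of the disparity-based proximity test (the function name, map layout, and pixel threshold are our assumptions, not the paper's implementation):

# Sketch: flag tool-retina proximity from relative stereo disparity.
import numpy as np

def proximity_alert(tool_xy, tool_disparity, retina_disparity_map,
                    threshold_px=2.0):
    """Return True when the tool's disparity approaches that of the
    retinal surface directly behind it; the threshold is a placeholder."""
    x, y = tool_xy
    surface_disparity = retina_disparity_map[y, x]
    # A small relative disparity means the tool tip is near the retina.
    return abs(tool_disparity - surface_disparity) < threshold_px

# Toy usage with a flat synthetic disparity map and a tracked tool tip.
disparity_map = np.full((480, 640), 30.0)
print(proximity_alert((320, 240), 31.0, disparity_map))  # True: close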
Abstract:
In this paper we study the problem of blind deconvolution. Our analysis is based on the algorithm of Chan and Wong [2], which popularized the use of sparse gradient priors via total variation. We use this algorithm because many methods in the literature are essentially adaptations of this framework. The algorithm is an iterative alternating energy minimization in which, at each step, either the sharp image or the blur function is reconstructed. Recent work by Levin et al. [14] showed that any algorithm that tries to minimize this same energy would fail, as the desired solution has a higher energy than the no-blur solution, in which the sharp image is the blurry input and the blur is a Dirac delta. Experimentally, however, one can observe that Chan and Wong's algorithm converges to the desired solution even when initialized with the no-blur one. We provide both analysis and experiments to resolve this apparent paradox. We find that both claims are right. The key to understanding how this is possible lies in the details of Chan and Wong's implementation and in how seemingly harmless choices have dramatic effects. Our analysis reveals that the delayed scaling (normalization) in the iterative step of the blur kernel is fundamental to the convergence of the algorithm. The resulting procedure eludes the no-blur solution, despite it being a global minimum of the original energy. We introduce an adaptation of this algorithm and show that, in spite of its extreme simplicity, it is very robust and achieves a performance comparable to the state of the art.
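The key observation (delayed kernel normalization) can be made concrete with a rough alternating-minimization sketch; the step sizes, the smoothed-TV gradient, and the convolution handling below are illustrative choices, not the authors' implementation.

# Sketch of TV-regularized alternating blind deconvolution in which the
# kernel is normalized *after* its gradient step (delayed scaling).
import numpy as np
from scipy.signal import fftconvolve

def image_step(u, k, f, lam=2e-3, lr=1.0):
    """One gradient step on ||k*u - f||^2 + lam * smoothed-TV(u)."""
    residual = fftconvolve(u, k, mode="same") - f
    data_grad = fftconvolve(residual, k[::-1, ::-1], mode="same")
    gx, gy = np.gradient(u, axis=1), np.gradient(u, axis=0)
    norm = np.sqrt(gx**2 + gy**2 + 1e-6)
    tv_grad = -(np.gradient(gx / norm, axis=1) + np.gradient(gy / norm, axis=0))
    return u - lr * (data_grad + lam * tv_grad)

def kernel_step(k, u, f, lr=1e-4):
    """Gradient step on the kernel, then *delayed* normalization: the
    nonnegativity/sum-to-one projection is applied after the step."""
    residual = fftconvolve(u, k, mode="same") - f
    full = fftconvolve(residual, u[::-1, ::-1], mode="full")
    cy, cx = full.shape[0] // 2, full.shape[1] // 2
    kh, kw = k.shape
    k_grad = full[cy - kh // 2: cy + kh // 2 + 1,
                  cx - kw // 2: cx + kw // 2 + 1]
    k = np.clip(k - lr * k_grad, 0.0, None)  # projection happens afterwards
    return k / max(k.sum(), 1e-12)

def blind_deconvolve(f, ksize=15, iters=50):
    u = f.copy()                         # start from the blurry input
    k = np.zeros((ksize, ksize))
    k[ksize // 2, ksize // 2] = 1.0      # no-blur (Dirac delta) init
    for _ in range(iters):
        u = image_step(u, k, f)
        k = kernel_step(k, u, f)
    return u, k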
Abstract:
In this work we devise two novel algorithms for blind deconvolution based on a family of logarithmic image priors. In contrast to recent approaches, we consider a minimalistic formulation of the blind deconvolution problem with only two energy terms: a least-squares term for data fidelity and an image prior based on a lower-bounded logarithm of the norm of the image gradients. We show that this energy formulation is sufficient to achieve the state of the art in blind deconvolution by a good margin over previous methods. Much of the performance is due to the chosen prior. On the one hand, this prior is very effective in favoring sparsity of the image gradients. On the other hand, it is non-convex, so minimization schemes that can deal effectively with local minima of the energy become necessary. We devise two iterative minimization algorithms that solve a convex problem at each iteration: one obtained via the primal-dual approach and one via majorization-minimization. While the former is computationally efficient, the latter achieves state-of-the-art performance on a public dataset.
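The abstract does not display the energy; a schematic rendering consistent with its description (the symbols \lambda and \epsilon, and the discrete gradient notation, are our choices) is

E(u, k) = \|k \ast u - f\|_2^2 + \lambda \sum_{x} \log\bigl(\epsilon + |\nabla u(x)|\bigr),

where the constant \epsilon > 0 bounds the logarithm from below. Under majorization-minimization, the concave logarithm is majorized by its tangent,

\log(\epsilon + t) \le \log(\epsilon + t_0) + \frac{t - t_0}{\epsilon + t_0},

so that, with t = |\nabla u(x)|, each iteration reduces to a convex, weighted total-variation problem.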
Abstract:
The study was designed to compare the prognostic value of immunoblotting and ELISA in the serological follow-up of young cystic echinococcosis (CE) patients exhibiting either a cured or a progredient (non-cured) course of disease after treatment. A total of 54 patients (mean age 9 years, range 3 to 15 years) with surgically, radiologically and/or histologically proven CE were studied for a period of up to 60 months after surgery. Additionally, some of the patients underwent chemotherapy. Based on the clinical course and outcome, as well as on imaging findings, patients were clustered into 2 groups of either cured (CCE) or non-cured (NCCE) CE patients. ELISA showed a high rate of seropositivity 4 to 5 years post-surgery for both CCE (57.1%) and NCCE (100%) patients; the difference between the two groups was not statistically significant. Immunoblotting based upon recognition of AgB subcomponents (8 and 16 kDa bands) showed a decrease in the respective antibody reactivities after 4 years post-surgery. Only sera from 14.3% of CCE patients recognized the subcomponents of AgB after 4 years, and none (0%) of these sera was still reactive at 5 years post-surgery. In contrast, immunoblotting remained positive for AgB subcomponents in 100% of the NCCE cases tested between 4 and 5 years after surgical treatment. Immunoblotting based upon the recognition of AgB subcomponents therefore proved to be a useful approach for monitoring the post-surgical follow-up of both CCE and NCCE in young patients.
Abstract:
Purpose: Recently, light and mobile reading devices with high display resolutions have become popular, and they may open new possibilities for reading applications in education, business and the private sector. The ability to adapt font size may also open new reading opportunities for people with impaired or low vision. Based on their display technology, two major groups of reading devices can be distinguished. One type, predominantly found in dedicated e-book readers, uses electronic paper, also known as e-Ink. The other devices, mostly multifunction tablet PCs, are equipped with backlit LCD displays. While it has long been accepted that reading on electronic displays is slow and associated with visual fatigue, this new generation of devices is explicitly promoted for reading. Since research has shown that, compared to reading on electronic displays, reading on paper is faster and requires fewer fixations per line, one would expect differential effects when comparing reading behaviour on e-Ink and LCD. In the present study we therefore compared experimentally how well these two display types are suited for reading over an extended period of time. Methods: Participants read for several hours on either e-Ink or LCD, and different measures of reading behaviour and visual strain were recorded at regular intervals. These dependent measures included subjective (visual) fatigue, a letter search task, reading speed, oculomotor behaviour and the pupillary light reflex. Results: The results suggested that reading on the two display types is very similar in terms of both subjective and objective measures. Conclusions: It is not the technology itself, but rather the image quality, that seems crucial for reading. Compared to the visual display units used in the previous few decades, these more recent electronic displays allow for good and comfortable reading, even for extended periods of time.