38 resultados para computer vision,stereo camera,people tracking,deep learning,plane detection,3d vision
em BORIS: Bern Open Repository and Information System - Berna - Sui
Resumo:
This book will serve as a foundation for a variety of useful applications of graph theory to computer vision, pattern recognition, and related areas. It covers a representative set of novel graph-theoretic methods for complex computer vision and pattern recognition tasks. The first part of the book presents the application of graph theory to low-level processing of digital images such as a new method for partitioning a given image into a hierarchy of homogeneous areas using graph pyramids, or a study of the relationship between graph theory and digital topology. Part II presents graph-theoretic learning algorithms for high-level computer vision and pattern recognition applications, including a survey of graph based methodologies for pattern recognition and computer vision, a presentation of a series of computationally efficient algorithms for testing graph isomorphism and related graph matching tasks in pattern recognition and a new graph distance measure to be used for solving graph matching problems. Finally, Part III provides detailed descriptions of several applications of graph-based methods to real-world pattern recognition tasks. It includes a critical review of the main graph-based and structural methods for fingerprint classification, a new method to visualize time series of graphs, and potential applications in computer network monitoring and abnormal event detection.
Resumo:
Background: Sensor-based recordings of human movements are becoming increasingly important for the assessment of motor symptoms in neurological disorders beyond rehabilitative purposes. ASSESS MS is a movement recording and analysis system being developed to automate the classification of motor dysfunction in patients with multiple sclerosis (MS) using depth-sensing computer vision. It aims to provide a more consistent and finer-grained measurement of motor dysfunction than currently possible. Objective: To test the usability and acceptability of ASSESS MS with health professionals and patients with MS. Methods: A prospective, mixed-methods study was carried out at 3 centers. After a 1-hour training session, a convenience sample of 12 health professionals (6 neurologists and 6 nurses) used ASSESS MS to capture recordings of standardized movements performed by 51 volunteer patients. Metrics for effectiveness, efficiency, and acceptability were defined and used to analyze data captured by ASSESS MS, video recordings of each examination, feedback questionnaires, and follow-up interviews. Results: All health professionals were able to complete recordings using ASSESS MS, achieving high levels of standardization on 3 of 4 metrics (movement performance, lateral positioning, and clear camera view but not distance positioning). Results were unaffected by patients’ level of physical or cognitive disability. ASSESS MS was perceived as easy to use by both patients and health professionals with high scores on the Likert-scale questions and positive interview commentary. ASSESS MS was highly acceptable to patients on all dimensions considered, including attitudes to future use, interaction (with health professionals), and overall perceptions of ASSESS MS. Health professionals also accepted ASSESS MS, but with greater ambivalence arising from the need to alter patient interaction styles. There was little variation in results across participating centers, and no differences between neurologists and nurses. Conclusions: In typical clinical settings, ASSESS MS is usable and acceptable to both patients and health professionals, generating data of a quality suitable for clinical analysis. An iterative design process appears to have been successful in accounting for factors that permit ASSESS MS to be used by a range of health professionals in new settings with minimal training. The study shows the potential of shifting ubiquitous sensing technologies from research into the clinic through a design approach that gives appropriate attention to the clinic environment.
Resumo:
Background: Individuals with type 1 diabetes (T1D) have to count the carbohydrates (CHOs) of their meal to estimate the prandial insulin dose needed to compensate for the meal’s effect on blood glucose levels. CHO counting is very challenging but also crucial, since an error of 20 grams can substantially impair postprandial control. Method: The GoCARB system is a smartphone application designed to support T1D patients with CHO counting of nonpacked foods. In a typical scenario, the user places a reference card next to the dish and acquires 2 images with his/her smartphone. From these images, the plate is detected and the different food items on the plate are automatically segmented and recognized, while their 3D shape is reconstructed. Finally, the food volumes are calculated and the CHO content is estimated by combining the previous results and using the USDA nutritional database. Results: To evaluate the proposed system, a set of 24 multi-food dishes was used. For each dish, 3 pairs of images were taken and for each pair, the system was applied 4 times. The mean absolute percentage error in CHO estimation was 10 ± 12%, which led to a mean absolute error of 6 ± 8 CHO grams for normal-sized dishes. Conclusion: The laboratory experiments demonstrated the feasibility of the GoCARB prototype system since the error was below the initial goal of 20 grams. However, further improvements and evaluation are needed prior launching a system able to meet the inter- and intracultural eating habits.
Resumo:
Rho guanosine triphosphatases (GTPases) control the cytoskeletal dynamics that power neurite outgrowth. This process consists of dynamic neurite initiation, elongation, retraction, and branching cycles that are likely to be regulated by specific spatiotemporal signaling networks, which cannot be resolved with static, steady-state assays. We present NeuriteTracker, a computer-vision approach to automatically segment and track neuronal morphodynamics in time-lapse datasets. Feature extraction then quantifies dynamic neurite outgrowth phenotypes. We identify a set of stereotypic neurite outgrowth morphodynamic behaviors in a cultured neuronal cell system. Systematic RNA interference perturbation of a Rho GTPase interactome consisting of 219 proteins reveals a limited set of morphodynamic phenotypes. As proof of concept, we show that loss of function of two distinct RhoA-specific GTPase-activating proteins (GAPs) leads to opposite neurite outgrowth phenotypes. Imaging of RhoA activation dynamics indicates that both GAPs regulate different spatiotemporal Rho GTPase pools, with distinct functions. Our results provide a starting point to dissect spatiotemporal Rho GTPase signaling networks that regulate neurite outgrowth.
Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neural Network
Resumo:
Automated tissue characterization is one of the most crucial components of a computer aided diagnosis (CAD) system for interstitial lung diseases (ILDs). Although much research has been conducted in this field, the problem remains challenging. Deep learning techniques have recently achieved impressive results in a variety of computer vision problems, raising expectations that they might be applied in other domains, such as medical image analysis. In this paper, we propose and evaluate a convolutional neural network (CNN), designed for the classification of ILD patterns. The proposed network consists of 5 convolutional layers with 2×2 kernels and LeakyReLU activations, followed by average pooling with size equal to the size of the final feature maps and three dense layers. The last dense layer has 7 outputs, equivalent to the classes considered: healthy, ground glass opacity (GGO), micronodules, consolidation, reticulation, honeycombing and a combination of GGO/reticulation. To train and evaluate the CNN, we used a dataset of 14696 image patches, derived by 120 CT scans from different scanners and hospitals. To the best of our knowledge, this is the first deep CNN designed for the specific problem. A comparative analysis proved the effectiveness of the proposed CNN against previous methods in a challenging dataset. The classification performance (~85.5%) demonstrated the potential of CNNs in analyzing lung patterns. Future work includes, extending the CNN to three-dimensional data provided by CT volume scans and integrating the proposed method into a CAD system that aims to provide differential diagnosis for ILDs as a supportive tool for radiologists.
Resumo:
Images of an object under different illumination are known to provide strong cues about the object surface. A mathematical formalization of how to recover the normal map of such a surface leads to the so-called uncalibrated photometric stereo problem. In the simplest instance, this problem can be reduced to the task of identifying only three parameters: the so-called generalized bas-relief (GBR) ambiguity. The challenge is to find additional general assumptions about the object, that identify these parameters uniquely. Current approaches are not consistent, i.e., they provide different solutions when run multiple times on the same data. To address this limitation, we propose exploiting local diffuse reflectance (LDR) maxima, i.e., points in the scene where the normal vector is parallel to the illumination direction (see Fig. 1). We demonstrate several noteworthy properties of these maxima: a closed-form solution, computational efficiency and GBR consistency. An LDR maximum yields a simple closed-form solution corresponding to a semi-circle in the GBR parameters space (see Fig. 2); because as few as two diffuse maxima in different images identify a unique solution, the identification of the GBR parameters can be achieved very efficiently; finally, the algorithm is consistent as it always returns the same solution given the same data. Our algorithm is also remarkably robust: It can obtain an accurate estimate of the GBR parameters even with extremely high levels of outliers in the detected maxima (up to 80 % of the observations). The method is validated on real data and achieves state-of-the-art results.
Resumo:
Diet management is a key factor for the prevention and treatment of diet-related chronic diseases. Computer vision systems aim to provide automated food intake assessment using meal images. We propose a method for the recognition of already segmented food items in meal images. The method uses a 6-layer deep convolutional neural network to classify food image patches. For each food item, overlapping patches are extracted and classified and the class with the majority of votes is assigned to it. Experiments on a manually annotated dataset with 573 food items justified the choice of the involved components and proved the effectiveness of the proposed system yielding an overall accuracy of 84.9%.
Resumo:
In this paper we propose a solution to blind deconvolution of a scene with two layers (foreground/background). We show that the reconstruction of the support of these two layers from a single image of a conventional camera is not possible. As a solution we propose to use a light field camera. We demonstrate that a single light field image captured with a Lytro camera can be successfully deblurred. More specifically, we consider the case of space-varying motion blur, where the blur magnitude depends on the depth changes in the scene. Our method employs a layered model that handles occlusions and partial transparencies due to both motion blur and out of focus blur of the plenoptic camera. We reconstruct each layer support, the corresponding sharp textures, and motion blurs via an optimization scheme. The performance of our algorithm is demonstrated on synthetic as well as real light field images.
Resumo:
The contribution of this article demonstrates how to identify context-aware types of e-Learning objects (eLOs) derived from the subject domains. This perspective is taken from an engineering point of view and is applied during requirements elicitation and analysis relating to present work in constructing an object-oriented (OO), dynamic, and adaptive model to build and deliver packaged e-Learning courses. Consequently, three preliminary subject domains are presented and, as a result, three primitive types of eLOs are posited. These types educed from the subject domains are of structural, conceptual, and granular nature. Structural objects are responsible for the course itself, conceptual objects incorporate adaptive and logical interoperability, while granular objects congregate granular assets. Their differences, interrelationships, and responsibilities are discussed. A major design challenge relates to adaptive behaviour. Future research addresses refinement on the subject domains and adaptive hypermedia systems.
Resumo:
In this paper, we present a consolidation method that is based on a new representation of 3D point sets. The key idea is to augment each surface point into a deep point by associating it with an inner point that resides on the meso-skeleton, which consists of a mixture of skeletal curves and sheets. The deep points representation is a result of a joint optimization applied to both ends of the deep points. The optimization objective is to fairly distribute the end points across the surface and the meso-skeleton, such that the deep point orientations agree with the surface normals. The optimization converges where the inner points form a coherent meso-skeleton, and the surface points are consolidated with the missing regions completed. The strength of this new representation stems from the fact that it is comprised of both local and non-local geometric information. We demonstrate the advantages of the deep points consolidation technique by employing it to consolidate and complete noisy point-sampled geometry with large missing parts.