9 results for 3D scene understanding

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance:

80.00%

Publisher:

Abstract:

Safe collaboration between a robot and a human operator is a critical requirement for deploying a robotic system in a manufacturing and testing environment. In this dissertation, a safety requirement is developed and implemented for the navigation system of mobile manipulators. A methodology for human-robot co-existence through 3D scene analysis is also investigated. The proposed approach exploits advances in computing capability by relying on graphics processing units (GPUs) for volumetric predictive human-robot contact checking. Apart from guaranteeing the safety of operators, human-robot collaboration is also fundamental when cooperative activities are required, as on an appliance test automation floor. To achieve this, a generalized hierarchical task controller scheme for collision avoidance is developed. This allows the robotic arm to safely approach and inspect the interior of the appliance without collision during the testing procedure. The unpredictable presence of operators also forms a dynamic obstacle that changes very quickly, thereby requiring a quick reaction from the robot. To this end, a GPU-accelerated distance field is computed to speed up the reaction time and avoid collisions between the human operator and the robot. Automated appliance testing also involves robotized laundry loading and unloading during life cycle testing. This task involves laundry detection, grasp pose estimation and manipulation in a container, inside the drum and during recovery grasping. Wrinkle and blob detection algorithms for grasp pose estimation are developed, and grasp poses are calculated along the wrinkles and blobs to perform the grasping task efficiently. By ranking the estimated laundry grasp poses according to a predefined cost function, the robotic arm attempts the grasp poses that are more comfortable from the robot kinematic side as well as collision-free on the appliance side. 
This is achieved through appliance detection, full-model registration, and collision-free trajectory execution using online collision avoidance.
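The ranking of grasp poses by a predefined cost function, as described in the abstract above, can be illustrated with a minimal sketch. The abstract does not give the actual cost function, so everything here is an assumption made for illustration: the `GraspCandidate` fields, the weights, and the clearance threshold are all hypothetical.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class GraspCandidate:
    """A hypothetical grasp candidate; the fields are illustrative assumptions."""
    name: str
    joint_effort: float  # assumed kinematic "discomfort" score, lower is better
    clearance_m: float   # assumed minimum distance to the appliance model, in metres


def grasp_cost(c, w_effort=1.0, w_clearance=0.5, min_clearance_m=0.02):
    """Assumed cost: penalise kinematic effort and proximity to the appliance."""
    if c.clearance_m < min_clearance_m:
        return math.inf  # treated as colliding, hence infeasible
    return w_effort * c.joint_effort + w_clearance / c.clearance_m


def rank_grasps(candidates):
    """Return the feasible candidates sorted from lowest to highest cost."""
    feasible = [c for c in candidates if math.isfinite(grasp_cost(c))]
    return sorted(feasible, key=grasp_cost)


candidates = [
    GraspCandidate("wrinkle_0", joint_effort=0.2, clearance_m=0.10),
    GraspCandidate("blob_0", joint_effort=0.1, clearance_m=0.01),
    GraspCandidate("wrinkle_1", joint_effort=0.5, clearance_m=0.20),
]
ranked = rank_grasps(candidates)
```

In this sketch the robot would try `ranked[0]` first and fall back to later entries if a grasp fails. The candidate that sits too close to the appliance is discarded rather than ranked.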

Relevance:

50.00%

Publisher:

Abstract:

This thesis investigates interactive scene reconstruction and understanding using RGB-D data only. We believe that depth cameras will remain, in the near future, a cheap and low-power 3D sensing alternative that is also suitable for mobile devices. Our contributions therefore build on state-of-the-art approaches to achieve advances in three challenging scenarios, namely mobile mapping, large-scale surface reconstruction and semantic modeling. First, we will describe an effective approach to Simultaneous Localization And Mapping (SLAM) on platforms with limited resources, such as a tablet device. Unlike previous methods, dense reconstruction is achieved by reprojection of RGB-D frames, while local consistency is maintained by deploying relative bundle adjustment principles. We will show quantitative results comparing our technique to the state of the art, as well as detailed reconstructions of various environments ranging from rooms to small apartments. Then, we will address large-scale surface modeling from depth maps, exploiting parallel GPU computing. We will develop a real-time camera tracking method based on the popular KinectFusion system and an online surface alignment technique capable of counteracting drift errors and closing small loops. We will show very high quality meshes that outperform existing methods on publicly available datasets as well as on data recorded with our RGB-D camera, even in complete darkness. Finally, we will move to our Semantic Bundle Adjustment framework, which effectively combines object detection and SLAM in a unified system. Though the mathematical framework we will describe is not restricted to a particular sensing technology, in the experimental section we will refer, again, only to RGB-D sensing. We will discuss successful implementations of our algorithm, showing the benefit of joint object detection, camera tracking and environment mapping.

Relevance:

50.00%

Publisher:

Abstract:

The first mechanical Automaton concept was found in a Chinese text written in the 3rd century BC, while Computer Vision was born in the late 1960s. Therefore, visual perception applied to machines (i.e. Machine Vision) is a young and exciting alliance. When robots came in, the new field of Robotic Vision was born, and these terms began to be erroneously interchanged. In short, we can say that Machine Vision is an engineering domain, which concerns the industrial use of Vision. Robotic Vision, instead, is a research field that tries to incorporate robotics aspects in computer vision algorithms. Visual Servoing, for example, is one of the problems that cannot be solved by computer vision alone. Accordingly, a large part of this work deals with boosting popular Computer Vision techniques by exploiting robotics: e.g. the use of kinematics to localize a vision sensor mounted as the robot end-effector. The remainder of this work is dedicated to the counterpart, i.e. the use of computer vision to solve real robotic problems such as grasping objects or navigating while avoiding obstacles. A brief survey of the mapping data structures most widely used in robotics will be presented, along with SkiMap, a novel sparse data structure created both for robotic mapping and as a general-purpose 3D spatial index. Several approaches to implementing Object Detection and Manipulation by exploiting the aforementioned mapping strategies will then be proposed, along with a completely new Machine Teaching facility intended to simplify the training procedure of modern Deep Learning networks.

Relevance:

50.00%

Publisher:

Abstract:

Sketches are a unique way to communicate: drawing a simple sketch does not require any training, sketches convey information that is hard to describe with words, they are powerful enough to represent almost any concept, and nowadays it is possible to draw directly on mobile devices. Motivated by the unique characteristics of sketches and fascinated by the human ability to imagine 3D objects from drawings, this thesis focuses on automatically associating geometric information with sketches. The main research directions of the thesis can be summarized as obtaining geometric information from freehand scene sketches to improve 2D sketch-based tasks, and investigating Vision-Language models to overcome the limitations of 3D sketch-based tasks. The first part of the thesis concerns the prediction of geometric information from scene sketches, which improves scene sketch to image generation and unlocks new creative effects. The thesis then presents a study of the embedding space of Vision-Language models, considering sketches, line renderings and RGB renderings of 3D shapes, with the aim of overcoming the reliance of 3D sketch-based tasks on supervised datasets, which are limited and hard to acquire. Following the observations and results obtained, Vision-Language models are applied to Sketch Based Shape Retrieval without the need for training on supervised datasets. We then analyze the use of Vision-Language models for sketch-based 3D reconstruction in an unsupervised manner. In the final chapter we report the results obtained in an additional project carried out during the PhD, which has led to the development of a framework to learn an embedding space of neural networks that can be navigated to get ready-to-use models with desired characteristics.

Relevance:

30.00%

Publisher:

Abstract:

Natural hazards related to volcanic activity represent a potential risk factor, particularly in the vicinity of human settlements. Besides the risk related to explosive and effusive activity, the instability of volcanic edifices may develop into large landslides that are often catastrophically destructive, as shown by the collapse of the northern flank of Mount St. Helens in 1980. A combined approach was applied to analyse slope failures that occurred at Stromboli volcano. SdF slope stability was evaluated by using high-resolution multi-temporal DTMMs and performing limit equilibrium stability analyses. High-resolution topographical data collected with remote sensing techniques and three-dimensional slope stability analysis play a key role in understanding instability mechanisms and the related risks. Analyses carried out on the 2002–2003 and 2007 Stromboli eruptions, starting from high-resolution data acquired through airborne remote sensing surveys, permitted the estimation of the lava volumes emplaced on the SdF slope and contributed to the investigation of the link between magma emission and slope instabilities. Limit Equilibrium analyses were performed on the 2001 and 2007 3D models in order to simulate the slope behavior before the 2002–2003 landslide event and after the 2007 eruption. Stability analyses were conducted to understand the mechanisms that controlled the slope deformations which occurred shortly after the onset of the 2007 eruption and involved the upper part of the slope. Limit equilibrium analyses applied to both cases yielded results that are congruent with observations and monitoring data. 
The results presented in this work indicate that hazard assessment for the island of Stromboli should take into account the fact that a new magma intrusion could lead to further destabilisation of the slope, which may be more significant than the one recently observed because it will affect an already disarranged deposit and a fractured and loosened crater area. The two-pronged approach, based on the analysis of 3D multi-temporal mapping datasets and on the application of LE methods, contributed to a better understanding of volcano flank behaviour and to preparedness for actions aimed at risk mitigation.

Relevance:

30.00%

Publisher:

Abstract:

The term Ambient Intelligence (AmI) refers to a vision of the future of the information society in which smart electronic environments are sensitive and responsive to the presence of people and their activities (context awareness). In an ambient intelligence world, devices work in concert to support people in carrying out their everyday life activities, tasks and rituals in an easy, natural way, using information and intelligence that is hidden in the network connecting these devices. This promotes the creation of pervasive environments that improve the quality of life of the occupants and enhance the human experience. AmI stems from the convergence of three key technologies: ubiquitous computing, ubiquitous communication and natural interfaces. Ambient intelligent systems are heterogeneous and require excellent cooperation between several hardware/software technologies and disciplines, including signal processing, networking and protocols, embedded systems, information management, and distributed algorithms. Since a large number of fixed and mobile embedded sensors are deployed in the environment, Wireless Sensor Networks (WSNs) are one of the most relevant enabling technologies for AmI. WSNs are complex systems made up of a number of sensor nodes which can be deployed in a target area to sense physical phenomena and communicate with other nodes and base stations. These simple devices typically embed a low-power computational unit (microcontrollers, FPGAs etc.), a wireless communication unit, one or more sensors and some form of energy supply (either batteries or energy scavenger modules). WSNs promise to revolutionize the interactions between the real physical world and human beings. Low cost, low computational power, low energy consumption and small size are characteristics that must be taken into consideration when designing and dealing with WSNs. To fully exploit the potential of distributed sensing approaches, a set of challenges must be addressed. 
Sensor nodes are inherently resource-constrained systems with very low power consumption and small size requirements, which enable them to reduce interference with the sensed physical phenomena and allow easy and low-cost deployment. They have limited processing speed, storage capacity and communication bandwidth, which must be used efficiently to increase the degree of local "understanding" of the observed phenomena. A particular case of sensor nodes is video sensors. This topic holds strong interest for a wide range of contexts such as military, security, robotics and, most recently, consumer applications. Vision sensors are extremely effective for medium- to long-range sensing because vision provides rich information to human operators. However, image sensors generate a huge amount of data, which must be heavily processed before it is transmitted because of the scarce bandwidth of radio interfaces. In particular, in video surveillance, it has been shown that source-side compression is mandatory due to limited bandwidth and delay constraints. Moreover, there is ample opportunity for performing higher-level processing functions, such as object recognition, which has the potential to drastically reduce the required bandwidth (e.g. by transmitting compressed images only when something 'interesting' is detected). The energy cost of image processing must, however, be carefully minimized. Imaging can play an important role in sensing devices for ambient intelligence. Computer vision can, for instance, be used for recognising persons and objects and for recognising behaviour such as illness and rioting. Having a wireless camera as a camera mote opens the way for distributed scene analysis. More eyes see more than one, and a camera system that can observe a scene from multiple directions would be able to overcome occlusion problems and could describe objects in their true 3D appearance. Real-time implementations of these approaches are a recently opened field of research. 
In this thesis we pay attention to the realities of hardware/software technologies and the design needed to realize systems for distributed monitoring, attempting to propose solutions to open issues and to fill the gap between AmI scenarios and hardware reality. The physical implementation of an individual wireless node is constrained by three important metrics, outlined below. Although the design of the sensor network and its sensor nodes is strictly application dependent, a number of constraints should almost always be considered. Among them:
• Small form factor, to reduce node intrusiveness.
• Low power consumption, to reduce battery size and extend node lifetime.
• Low cost, for widespread diffusion.
These limitations typically result in the adoption of low-power, low-cost devices such as low-power microcontrollers with a few kilobytes of RAM and tens of kilobytes of program memory, on which only simple data processing algorithms can be implemented. However, the overall computational power of the WSN can be very large, since the network presents a high degree of parallelism that can be exploited through the adoption of ad-hoc techniques. Furthermore, through the fusion of information from the dense mesh of sensors, even complex phenomena can be monitored. In this dissertation we present our results in building several AmI applications suitable for a WSN implementation. The work can be divided into two main areas: Low Power Video Sensor Nodes and Video Processing Algorithms, and Multimodal Surveillance.
Low Power Video Sensor Nodes and Video Processing Algorithms: In comparison to scalar sensors, such as temperature, pressure, humidity, velocity, and acceleration sensors, vision sensors generate much higher bandwidth data due to the two-dimensional nature of their pixel array. We have tackled all the constraints listed above and have proposed solutions to overcome the current WSN limits for video sensor nodes. 
We have designed and developed wireless video sensor nodes, focusing on small size and on flexibility of reuse in different applications. The video nodes target a different design point: portability (on-board power supply, wireless communication) and a tight power budget (500 mW), while still providing a high level of intelligence, namely sophisticated classification algorithms and a high level of reconfigurability. We developed two different video sensor nodes. The device architecture of the first one is based on a low-cost, low-power FPGA+microcontroller system-on-chip. The second one is based on an ARM9 processor. Both systems, designed within the above-mentioned power envelope, could operate continuously with a Li-Polymer battery pack and a solar panel. Novel low-power, low-cost video sensor nodes are presented which, in contrast to sensors that just watch the world, are capable of comprehending the perceived information in order to interpret it locally. Featuring such intelligence, these nodes would be able to cope with tasks such as recognition of unattended bags in airports, persons carrying potentially dangerous objects, etc., which normally require a human operator. Vision algorithms for object detection and acquisition, such as human detection with Support Vector Machine (SVM) classification and abandoned/removed object detection, are implemented, described and illustrated on real-world data.
Multimodal surveillance: In several setups the use of wired video cameras may not be possible. For this reason, building an energy-efficient wireless vision network for monitoring and surveillance is one of the major efforts in the sensor network and distributed surveillance communities. 
Pyroelectric Infra-Red (PIR) sensors have been used to extend the lifetime of a solar-powered video sensor node by providing an energy-level-dependent trigger to the video camera and the wireless module. This approach has been shown to extend node lifetime and possibly result in continuous operation of the node. Being low-cost, passive (thus low-power) and compact, PIR sensors are well suited for WSN applications. Moreover, aggressive power management policies are essential for achieving long-term operation of standalone distributed cameras. We have used an adaptive controller based on Model Predictive Control (MPC), which improves system performance and outperforms naive power management policies.
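The energy-level-dependent PIR trigger mentioned in the abstract above can be sketched as follows. This is not the policy used in the thesis, which the abstract does not specify. It is a hypothetical rule in which the number of consecutive PIR detections needed to wake the camera grows as the stored energy drops. The function names and the thresholds (0.6, 0.2, and the hit counts 1 and 3) are assumptions chosen only for illustration.

```python
def required_pir_hits(battery_fraction):
    """Consecutive PIR detections needed before waking the camera.

    Hypothetical policy: the thresholds below are illustrative assumptions.
    Returns None when the energy level is too low to wake the camera at all.
    """
    if not 0.0 <= battery_fraction <= 1.0:
        raise ValueError("battery_fraction must be between 0 and 1")
    if battery_fraction >= 0.6:
        return 1   # plenty of energy: wake on the first detection
    if battery_fraction >= 0.2:
        return 3   # moderate energy: require sustained motion
    return None    # low energy: keep camera and radio off


def should_wake_camera(consecutive_hits, battery_fraction):
    """Decide whether the PIR activity justifies powering the camera."""
    needed = required_pir_hits(battery_fraction)
    return needed is not None and consecutive_hits >= needed
```

For example, `should_wake_camera(1, 0.8)` is true, while the same single detection at a battery fraction of 0.4 is ignored until motion persists for three detections.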

Relevance:

30.00%

Publisher:

Abstract:

The primary objective of this thesis is to obtain a better understanding of the 3D velocity structure of the lithosphere in central Italy. To this end, I adopted the Spectral-Element Method (SEM) to perform accurate numerical simulations of the complex wavefields generated by the 2009 Mw 6.3 L’Aquila event and by its foreshocks and aftershocks, together with some additional events within our target region. For the mainshock, the source was represented by a finite fault, and different models for central Italy, both 1D and 3D, were tested. Surface topography, attenuation and the Moho discontinuity were also accounted for. Three-component synthetic waveforms were compared to the corresponding recorded data. The results of these analyses show that 3D models, including all the known structural heterogeneities in the region, are essential to accurately reproduce waveform propagation. They make it possible to capture features of the seismograms that are mainly related to topography or to low-wavespeed areas and, combined with a finite fault model, result in a favorable match between data and synthetics for frequencies up to ~0.5 Hz. We also obtained peak ground velocity maps, which provide valuable information for seismic hazard assessment. The remaining differences between data and synthetics led us to take advantage of SEM combined with an adjoint method to iteratively improve the available 3D structure model for central Italy. A total of 63 events and 52 stations in the region were considered. We performed five iterations of the tomographic inversion, calculating the misfit function gradient, which is necessary for the model update, from adjoint sensitivity kernels constructed using only two simulations for each event. Our last updated model features a reduced traveltime misfit function and improved agreement between data and synthetics, although further iterations, as well as refined source solutions, are necessary to obtain a new reference 3D model for central Italy tomography.

Relevance:

30.00%

Publisher:

Abstract:

Depth represents a crucial piece of information in many practical applications, such as obstacle avoidance and environment mapping. This information can be provided either by active sensors, such as LiDARs, or by passive devices like cameras. A popular passive device is the binocular rig, which allows triangulating the depth of the scene through two synchronized and aligned cameras. However, many devices that are already available in several infrastructures are monocular passive sensors, such as most surveillance cameras. The intrinsic ambiguity of the problem makes monocular depth estimation a challenging task. Nevertheless, the recent progress of deep learning strategies is paving the way towards a new class of algorithms able to handle this complexity. This work addresses many relevant topics related to the monocular depth estimation problem. It presents networks capable of predicting accurate depth values even on embedded devices and without the need for expensive ground-truth labels at training time. Moreover, it introduces strategies to estimate the uncertainty of these models, and it shows that monocular networks can easily generate training labels for different tasks at scale. Finally, it evaluates off-the-shelf monocular depth predictors for the relevant use case of social distance monitoring, and shows how this technology makes it possible to overcome the limitations of existing strategies.
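The triangulation performed by a binocular rig, mentioned in the abstract above, reduces to a one-line formula for an ideal rectified pair: depth Z = f · B / d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity in pixels. The sketch below only illustrates this standard relation and is not code from the thesis. The function name and the example values are assumptions.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres of a point seen by an ideal rectified stereo pair.

    focal_px:     focal length in pixels
    baseline_m:   distance between the two camera centres, in metres
    disparity_px: horizontal pixel shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_px * baseline_m / disparity_px


# Illustrative values only: 800 px focal length, 0.25 m baseline, 50 px disparity.
depth_m = depth_from_disparity(800.0, 0.25, 50.0)
```

A monocular camera has no baseline and hence no disparity to measure, which is the ambiguity that the learned depth predictors discussed in the abstract must resolve from image content alone.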

Relevance:

30.00%

Publisher:

Abstract:

This dissertation contributes to the scholarly debate on temporary teams by exploring team interactions and boundaries. The fundamental challenge in temporary teams originates from temporary participation in the teams. First, as participants join the team for a short period of time, there is not enough time to build trust, share understanding, and have effective interactions. Consequently, team outputs and practices built on team interactions become vulnerable. Secondly, as team participants move on and off the teams, the teams' boundaries become blurred over time. This leads to uncertainty among team participants and leaders about who is and is not identified as a team member, causing collective disagreement within the team. Focusing on the above-mentioned challenges, we conducted this research in healthcare organisations, since the use of temporary teams in healthcare and hospital settings is prevalent. In particular, we focused on orthopaedic teams that provide personalised treatments for patients using 3D printing technology. Qualitative and quantitative data were collected using interviews, observations, questionnaires and archival data at the Rizzoli Orthopaedic Institute, Bologna, Italy. This study provides the following research outputs. The first is a conceptual study that explores the temporary teams literature using bibliometric analysis and a systematic literature review to highlight research gaps. The second paper qualitatively studies temporary relationships within the teams by collecting data using group interviews and observations. The results highlight the role of short-term dyadic relationships as a basis for sharing and transferring knowledge at the team level. Moreover, the hierarchical structure of the teams facilitates knowledge sharing by supporting dyadic relationships within and beyond the team meetings. The third paper investigates the impact of blurred boundaries on temporary teams' performance. 
Using quantitative data collected through questionnaires and archival data, we concluded that boundary blurring, in terms of fluidity, overlap and dispersion, impacts team performance differently at high and low levels of task complexity.