981 results for 3D scene understanding
Abstract:
In this project, we propose the implementation of a 3D object recognition system optimized to operate under demanding time constraints. The system must be robust so that objects can be recognized properly in poor lighting conditions and in cluttered scenes with significant levels of occlusion. An important requirement must be met: the system must exhibit reasonable performance running on a low-power mobile GPU computing platform (NVIDIA Jetson TK1) so that it can be integrated into mobile robotics, ambient intelligence, or ambient assisted living applications. The acquisition system is based on color and depth (RGB-D) data streams provided by low-cost 3D sensors such as the Microsoft Kinect or PrimeSense Carmine. The range of algorithms and applications to be implemented and integrated is quite broad, spanning the acquisition, outlier removal, and filtering of the input data, the segmentation and characterization of regions of interest in the scene, and the object recognition and pose estimation itself. Furthermore, in order to validate the proposed system, we will create a 3D object dataset. It will be composed of a set of 3D models, reconstructed from common household objects, as well as a handful of test scenes in which those objects appear. The scenes will feature different levels of occlusion, varying distances between the objects and the sensor, and variations in the pose of the target objects. The creation of this dataset also entails the development of 3D data acquisition and 3D object reconstruction applications. The resulting system has many possible applications, ranging from mobile robot navigation and semantic scene labeling to human-computer interaction (HCI) systems based on visual information.
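A minimal sketch of the kind of preprocessing pipeline described above (filtering, support-plane removal, and segmentation of object candidates prior to recognition), using the Open3D library. The file name, voxel size, and clustering parameters are illustrative assumptions, not values taken from the project.

```python
import open3d as o3d

# Illustrative input: a point cloud converted from a Kinect / Carmine RGB-D frame.
pcd = o3d.io.read_point_cloud("scene.pcd")

# Downsample and remove outliers (filtering stage).
pcd = pcd.voxel_down_sample(voxel_size=0.005)
pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Segment and discard the dominant plane (e.g., the table the objects rest on).
plane, inliers = pcd.segment_plane(distance_threshold=0.01, ransac_n=3, num_iterations=1000)
objects = pcd.select_by_index(inliers, invert=True)

# Cluster the remaining points into candidate objects for recognition / pose estimation.
labels = objects.cluster_dbscan(eps=0.02, min_points=50)
print("candidate object clusters:", max(labels) + 1 if len(labels) else 0)
```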
Abstract:
Many applications, including object reconstruction, robot guidance, and scene mapping, require the registration of multiple views of a scene to generate a complete geometric and appearance model of it. In real situations, the transformations between views are unknown and must be estimated. In the last few years, the emergence of low-cost depth-sensing cameras has strengthened research on this topic, motivating a plethora of new applications. Although these cameras have enough resolution and accuracy for many applications, some situations cannot be solved with general state-of-the-art registration methods because of the signal-to-noise ratio (SNR) and resolution of the data they provide. The problem of working with low-SNR data can appear in any 3D system, so novel solutions are needed in this respect. In this paper, we propose μ-MAR, a method able to both coarsely and finely register sets of 3D points, provided by low-cost depth-sensing cameras (although it is not restricted to these sensors), into a common coordinate system. The method overcomes the noisy-data problem by means of a model-based multi-plane registration. Specifically, it iteratively registers 3D markers composed of multiple planes extracted from points in multiple views of the scene. Since the markers and the object of interest are static in the scene, the transformations obtained for the markers are applied to the object in order to reconstruct it. Experiments have been performed using synthetic and real data. The synthetic data allow a qualitative and quantitative evaluation by means of visual inspection and the Hausdorff distance, respectively. The real-data experiments show the performance of the proposal using data acquired by a PrimeSense Carmine RGB-D sensor. The method has been compared to several state-of-the-art methods. The results show that μ-MAR registers objects with high accuracy in the presence of noisy data, outperforming the existing methods.
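A small sketch of the quantitative evaluation metric mentioned above, the symmetric Hausdorff distance between a reference model and a registered reconstruction. The point sets here are random placeholders, not data from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two (N, 3) point sets.
    (SciPy also offers scipy.spatial.distance.directed_hausdorff.)"""
    d_ab = cKDTree(b).query(a)[0].max()   # farthest point of a from its nearest neighbour in b
    d_ba = cKDTree(a).query(b)[0].max()
    return max(d_ab, d_ba)

# Illustrative usage: synthetic ground truth vs. a slightly perturbed reconstruction.
reference = np.random.rand(1000, 3)
reconstruction = reference + 0.002 * np.random.randn(1000, 3)
print("Hausdorff distance:", hausdorff_distance(reference, reconstruction))
```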
Abstract:
The three-dimensional nature of building construction calls for 3D drawing as the best tool for design and for transmitting technical and formal knowledge. The aim of this paper is to show the application of 3D graphic expression in a historical analysis of the evolution of the industrialized building envelope in architecture, identifying its main technical and constructive constraints. The study compares the evolution of the use of industrialized construction systems through a graphical analysis of the most notable construction solutions. The methodology is based on the identification and study of selected industrialized construction systems composed of lightweight materials, as well as of architectural works that were influential in the evolution of the architectural envelope during the second half of the twentieth century. 3D graphic representation helps compare the analysed works in technological and formal terms, confirming the usefulness of computer-aided drawing in the constructive analysis carried out. In conclusion, the use of 3D architectural drawing contributes, through a better understanding of the spatial characteristics of construction solutions, to the analysis of the material and functional properties of industrialized construction systems and their application to architectural design, helping to refine knowledge of them and increasing the constructive quality and social commitment of architectural proposals.
Abstract:
During grasping and intelligent robotic manipulation tasks, the camera position relative to the scene changes dramatically because the camera is mounted on the robot's end effector, which moves to adapt its path and grasp objects correctly. For this reason, in this type of environment, a visual recognition system must be implemented that recognizes objects and obtains their positions in the scene automatically and autonomously. Furthermore, in industrial environments, all objects manipulated by robots are made of the same material and cannot be differentiated by features such as texture or color. In this work, first, a study and analysis of 3D recognition descriptors has been carried out for application in these environments. Second, a visual recognition system based on a specific distributed client-server architecture is proposed for the recognition of industrial objects that lack these appearance features. Our system has been implemented to overcome recognition problems when objects can only be recognized by their geometric shape and the simplicity of those shapes could create ambiguity. Finally, some real tests are performed and illustrated to verify the satisfactory performance of the proposed system.
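A hedged sketch of geometry-only description of the kind such a study would consider, using Open3D's FPFH features. FPFH is one generic 3D descriptor, not necessarily the one selected in the paper, and averaging its local histograms into a single global signature is only an illustrative simplification; the file names are placeholders.

```python
import numpy as np
import open3d as o3d

def global_geometry_signature(pcd, radius=0.02):
    """Crude global shape signature: the mean of the local FPFH histograms.
    Purely geometric, so it works for texture-less industrial parts."""
    pcd.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=radius, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        pcd, o3d.geometry.KDTreeSearchParamHybrid(radius=2.5 * radius, max_nn=100))
    return np.asarray(fpfh.data).mean(axis=1)     # averaged 33-bin histogram

# Recognition by nearest signature over a small model database (placeholder files).
models = {name: global_geometry_signature(o3d.io.read_point_cloud(name))
          for name in ["part_a.pcd", "part_b.pcd"]}
scene_cluster = global_geometry_signature(o3d.io.read_point_cloud("segmented_cluster.pcd"))
best = min(models, key=lambda n: np.linalg.norm(models[n] - scene_cluster))
print("best matching model:", best)
```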
Abstract:
Cultural heritage sites all over the world are at risk due to aggressive urban expansion, development, wars, and general obsolescence. Not all objects are recorded in detail, although they may have social and historical significance. For example, more emphasis is placed on the recording of castles and palaces than on crofters’ cottages or tenement blocks, although their history can be just as rich. This paper investigates the historic fabric of Aberdeen through the use of digital scanning, supported by a range of media including old photographs and paintings. It explores the dissemination of social heritage through visualisations and how this can aid the understanding of space within the city or a specific area. Focus is given to the major statues and monuments within the context of the city centre, exploring their importance in their environment. It also examines why many have been relocated away from their original sites and how, in the process, some of the social and historical significance of a monument's original placement has perhaps been lost. It will be argued that digital media can be utilised for much more than re-creation and re-presentation of physical entities. Digital scanning, in association with visualisation tools, is used to capture the essence of both the cultural heritage and the society that created or used the sites, in some way re-enacting the original importance placed upon the monument in its original location, through adoption of BIM Heritage.
Abstract:
Since the beginning of 3D computer vision, it has been necessary to use techniques that reduce the data to a tractable size while preserving the important aspects of the scene. Currently, with the new low-cost RGB-D sensors, which provide a stream of color and 3D data at approximately 30 frames per second, this is becoming even more relevant. Many applications make use of these sensors and need a preprocessing step to downsample the data in order to either reduce the processing time or improve the data (e.g., reducing noise or enhancing the important features). In this paper, we present a comparison of downsampling techniques based on different principles. Concretely, five downsampling methods are included: a bilinear-based method, a normal-based method, a color-based method, a combination of the normal- and color-based samplings, and a growing neural gas (GNG)-based approach. For the comparison, two different models acquired with the Blensor software have been used. Moreover, to evaluate the effect of the downsampling in a real application, a 3D non-rigid registration is performed with the sampled data. From the experimentation we can conclude that, depending on the purpose of the application, some sampling kernels can drastically improve the results. The bilinear- and GNG-based methods provide homogeneous point clouds, whereas the color- and normal-based methods provide datasets with a higher density of points in areas with specific features. In the non-rigid registration, if a color-based sampled point cloud is used, it is possible to properly register two datasets in cases where intensity data are relevant to the model, outperforming the results obtained when only a homogeneous sampling is used.
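A simplified illustration of the homogeneous-versus-feature-driven contrast discussed above, using Open3D. It compares a voxel-grid (homogeneous) downsampling with a crude normal-variation sampling; the actual kernels compared in the paper are not reproduced here, and the input file and parameters are assumptions.

```python
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("model.pcd")        # placeholder input cloud
pcd.estimate_normals(o3d.geometry.KDTreeSearchParamKNN(knn=20))

# Homogeneous sampling: a voxel grid keeps a roughly uniform point density.
homogeneous = pcd.voxel_down_sample(voxel_size=0.01)

# Crude normal-based sampling: keep points whose normals disagree most with their
# neighbours (edges, ridges), giving a denser sampling around geometric features.
points = np.asarray(pcd.points)
normals = np.asarray(pcd.normals)
tree = o3d.geometry.KDTreeFlann(pcd)
variation = np.empty(len(points))
for i in range(len(points)):
    _, idx, _ = tree.search_knn_vector_3d(pcd.points[i], 20)
    variation[i] = 1.0 - np.abs(normals[idx] @ normals[i]).mean()
keep = np.argsort(variation)[-len(points) // 10:]  # keep the 10% most "featured" points
feature_based = pcd.select_by_index(keep.tolist())

print(len(homogeneous.points), "vs", len(feature_based.points), "points kept")
```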
Abstract:
In many classification problems, it is necessary to consider the specific location in an n-dimensional space from which features have been calculated. For example, considering the location of features extracted from specific areas of a two-dimensional space, such as an image, could improve the understanding of a scene for a video surveillance system. In the same way, the same features extracted from different locations could mean different actions for a 3D HCI system. In this paper, we present a self-organizing feature map able to preserve the topology of the locations in the n-dimensional space from which the feature vectors have been extracted. The main contribution is implicitly preserving the topology of the original space, since considering the locations of the extracted features and their topology can ease the solution of certain problems. Specifically, the paper proposes the n-dimensional constrained self-organizing map preserving the input topology (nD-SOM-PINT). Features in adjacent areas of the n-dimensional space used to extract the feature vectors are explicitly mapped to adjacent areas of the nD-SOM-PINT, constraining the neural network structure and learning. As a case study, the neural network has been instantiated to represent and classify trajectories extracted from a sequence of images at a high level of semantic understanding. Experiments have been thoroughly carried out using the CAVIAR datasets (Corridor, Frontal and Inria), taking into account the global behaviour of an individual, in order to validate the ability to preserve the topology of the two-dimensional space and obtain high-performance trajectory classification, in contrast to approaches that do not consider the location of features. Moreover, a brief example is included to validate the nD-SOM-PINT proposal in a domain other than individual trajectories. The results confirm the high accuracy of the nD-SOM-PINT, outperforming previous methods aimed at classifying the same datasets.
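For context, a minimal sketch of a standard 2-D Kohonen self-organizing map trained on 2-D points (e.g., image coordinates along a trajectory). This is only the unconstrained baseline; the nD-SOM-PINT additionally constrains the units to the locations of the input space, which is not implemented here, and the training data are random placeholders.

```python
import numpy as np

def train_som(data, grid=(10, 10), iters=5000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a plain 2-D Kohonen SOM on (N, d) samples."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    ys, xs = np.mgrid[0:h, 0:w]
    for t in range(iters):
        x = data[rng.integers(len(data))]
        dist = np.linalg.norm(weights - x, axis=2)
        by, bx = np.unravel_index(dist.argmin(), dist.shape)   # best matching unit
        lr = lr0 * np.exp(-t / iters)                          # decaying learning rate
        sigma = sigma0 * np.exp(-t / iters)                    # shrinking neighbourhood
        g = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
        weights += lr * g[..., None] * (x - weights)
    return weights

# Illustrative usage: quantize 2-D trajectory points into a topological codebook.
trajectory = np.random.default_rng(1).random((500, 2))
codebook = train_som(trajectory)
```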
Abstract:
BACKGROUND Contrast-enhanced (ce) fluid-attenuated inversion recovery magnetic resonance imaging (FLAIR MRI) has recently been shown to identify leptomeningeal pathology in multiple sclerosis. OBJECTIVE To demonstrate leptomeningeal enhancement on three-dimensional (3D) FLAIR in a case of Susac's syndrome. METHODS Leptomeningeal enhancement was correlated with clinical activity over 20 months and compared to retinal fluorescein angiography. RESULTS The size, number, and location of leptomeningeal enhancement varied over time and generally correlated with symptom severity. The appearance was remarkably similar to that of retinal vasculopathy. CONCLUSION Ce 3D FLAIR may aid in diagnosis and understanding of pathophysiology in Susac's syndrome and may serve as a biomarker for disease activity.
Abstract:
Despite the insight gained from 2-D particle models, and given that the dynamics of crustal faults occur in 3-D space, the question remains, how do the 3-D fault gouge dynamics differ from those in 2-D? Traditionally, 2-D modeling has been preferred over 3-D simulations because of the computational cost of solving 3-D problems. However, modern high performance computing architectures, combined with a parallel implementation of the Lattice Solid Model (LSM), provide the opportunity to explore 3-D fault micro-mechanics and to advance understanding of effective constitutive relations of fault gouge layers. In this paper, macroscopic friction values from 2-D and 3-D LSM simulations, performed on an SGI Altix 3700 super-cluster, are compared. Two rectangular elastic blocks of bonded particles, with a rough fault plane and separated by a region of randomly sized non-bonded gouge particles, are sheared in opposite directions by normally-loaded driving plates. The results demonstrate that the gouge particles in the 3-D models undergo significant out-of-plane motion during shear. The 3-D models also exhibit a higher mean macroscopic friction than the 2-D models for varying values of interparticle friction. 2-D LSM gouge models have previously been shown to exhibit accelerating energy release in simulated earthquake cycles, supporting the Critical Point hypothesis. The 3-D models are shown to also display accelerating energy release, and good fits of power law time-to-failure functions to the cumulative energy release are obtained.
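A hedged illustration of the final analysis step mentioned above: fitting a power-law time-to-failure function to cumulative energy release. The functional form E(t) = A + B(t_f - t)^m is the one commonly used in Critical Point analyses; the data here are synthetic and purely illustrative, not LSM simulation output.

```python
import numpy as np
from scipy.optimize import curve_fit

def time_to_failure(t, A, B, tf, m):
    # Accelerating-release form: cumulative energy E(t) = A + B * (tf - t)^m,
    # with B < 0 and 0 < m < 1.
    return A + B * np.abs(tf - t) ** m

# Synthetic cumulative energy release (illustrative only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 9.5, 200)
E = time_to_failure(t, 100.0, -5.0, 10.0, 0.3) + rng.normal(0.0, 0.05, t.size)

p0 = [E.max(), -1.0, t.max() * 1.05, 0.5]          # rough initial guess
params, _ = curve_fit(time_to_failure, t, E, p0=p0, maxfev=20000)
print("fitted failure time tf = %.2f, exponent m = %.2f" % (params[2], params[3]))
```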
Abstract:
The use of 3D visualisation of digital information is a recent phenomenon. It relies on users understanding 3D perspectival spaces. Questions about the universal accessibility of such spaces have been debated since perspective's inception in the European Renaissance. Perspective has since become a strong cultural influence in Western visual communication. Perspective imaging assists the process of experimenting through the sketching or modelling of ideas. In particular, the recent 3D modelling of an essentially non-dimensional Cyber-space raises questions about how we think about information in general. While alternative methods clearly exist (such as Chinese isometry), they are rarely explored within the 3D paradigm. This paper seeks to generate further discussion on the historical background of perspective and its role in underpinning this emergent field.
Abstract:
The adult human intervertebral disc (IVD) is normally avascular. Changes to the extracellular matrix in degenerative disc disease may promote vascularisation and subsequently alter cell nutrition and disc homeostasis. This study examines the influence of cell density and the presence of glucose and serum on the proliferation and survival of IVD cells in 3D culture. Bovine nucleus pulposus (NP) cells were seeded at a range of cell densities (1.25 × 10^5 to 10^6 cells/mL) and cultured in alginate beads under standard culture conditions (with 3.15 g/L glucose and 10% serum), or without glucose and/or with 20% serum. Cell proliferation, apoptosis and cell senescence were examined after 8 days in culture. Under standard culture conditions, NP cell proliferation and cluster formation were inversely related to cell seeding density, whilst the number of apoptotic cells and enucleated "ghost" cells was positively correlated with cell seeding density. Increasing serum levels from 10% to 20% was associated with increased cluster size and also an increased prevalence of apoptotic cells within clusters. Omitting glucose produced even larger clusters and also more apoptotic and senescent cells. These studies demonstrate that NP cell growth and survival are influenced both by cell density and by the availability of serum and nutrients such as glucose. The observation of clustered, senescent, apoptotic or "ghost" cells in vitro suggests that environmental factors may influence the formation of these phenotypes, which have previously been reported in vivo. Hence this study has implications both for our understanding of degenerative disc disease and for cell-based therapy using cells cultured in vitro.
Abstract:
This paper presents the main concepts of a project under development concerning the analysis of a scene containing a large number of objects, represented as unstructured point clouds. To achieve what we call the "optimal scene interpretation" (the shortest scene description satisfying the MDL principle), we follow an approach for managing 3-D objects based on a semantic framework that uses ontologies to add and share conceptual knowledge about spatial objects.
Abstract:
This paper presents a novel algorithm for medial surface extraction based on the density-corrected Hamiltonian analysis of Torsello and Hancock [1]. In order to cope with the exponential growth of the number of voxels, we compute a first coarse discretization of the mesh, which is iteratively refined until a desired resolution is achieved. The refinement criterion relies on the analysis of the momentum field, where only the voxels with a suitable value of the divergence are exploded to a lower level of the hierarchy. In order to compensate for the discretization errors incurred at the coarser levels, a dilation procedure is added at the end of each iteration. Finally, we design a simple alignment procedure to correct the displacement of the extracted skeleton with respect to the true underlying medial surface. We evaluate the proposed approach with an extensive series of qualitative and quantitative experiments.
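A simplified stand-in for the divergence-based refinement criterion described above: the divergence of the normalised gradient of a distance transform is strongly negative near the medial surface, so those voxels are the ones selected for refinement. This is not the density-corrected Hamiltonian analysis itself, and the occupancy grid and threshold are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def divergence_of_flow(volume):
    """Divergence of the normalised gradient of the distance transform of a
    boolean occupancy grid; strongly negative values flag medial-surface voxels."""
    dist = distance_transform_edt(volume)
    grads = np.gradient(dist)
    norm = np.sqrt(sum(g ** 2 for g in grads)) + 1e-12
    grads = [g / norm for g in grads]
    return sum(np.gradient(grads[i], axis=i) for i in range(volume.ndim))

# Illustrative occupancy grid: a flat slab whose medial surface is its mid-plane.
volume = np.zeros((64, 64, 64), dtype=bool)
volume[16:48, 16:48, 28:36] = True
div = divergence_of_flow(volume)
refine = (div < -0.4) & volume        # voxels that would be "exploded" to a finer level
```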
Abstract:
3D reconstruction is the process of obtaining a detailed three-dimensional graphical model that represents a real scene. This process uses sequences of images taken of the scene, from which it can automatically extract depth information for feature points. These points are detected using a computational technique applied to the images that compose the dataset. Using SURF feature points, this work proposes a model for obtaining depth information for the feature points detected by the system. In the end, the proposed system extracts three important pieces of information from the image dataset: the 3D positions of the feature points; the relative rotation and translation matrices between images; and the relation between the baseline of adjacent images and the 3D point accuracy error found.
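A hedged sketch of the classical two-view version of this pipeline with OpenCV: match features, recover the relative rotation and translation from the essential matrix, and triangulate the matched points. SIFT stands in for SURF here (SURF requires the non-free opencv-contrib build), and the file names and camera intrinsics are assumptions, not values from the paper.

```python
import cv2
import numpy as np

img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)   # placeholder image pair
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
K = np.array([[525.0, 0, 319.5], [0, 525.0, 239.5], [0, 0, 1]])  # assumed intrinsics

# Detect and match local features (SIFT used in place of SURF).
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Essential matrix and relative pose (rotation R, translation t up to scale).
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# Triangulate matched points to obtain 3D positions in the first camera frame.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
pts3d = (pts4d[:3] / pts4d[3]).T
print("recovered", len(pts3d), "3D points")
```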
Abstract:
Moving through a stable, three-dimensional world is a hallmark of our motor and perceptual experience. This stability is constantly being challenged by movements of the eyes and head, inducing retinal blur and retino-spatial misalignments for which the brain must compensate. To do so, the brain must account for eye and head kinematics to transform two-dimensional retinal input into the reference frame necessary for movement or perception. The four studies in this thesis used both computational and psychophysical approaches to investigate several aspects of this reference frame transformation. In the first study, we examined the neural mechanism underlying the visuomotor transformation for smooth pursuit using a feedforward neural network model. After training, the model performed the general, three-dimensional transformation using gain modulation. This gave mechanistic significance to gain modulation observed in cortical pursuit areas while also providing several testable hypotheses for future electrophysiological work. In the second study, we asked how anticipatory pursuit, which is driven by memorized signals, accounts for eye and head geometry using a novel head-roll updating paradigm. We showed that the velocity memory driving anticipatory smooth pursuit relies on retinal signals, but is updated for the current head orientation. In the third study, we asked how forcing retinal motion to undergo a reference frame transformation influences perceptual decision making. We found that simply rolling one's head impairs perceptual decision making in a way captured by stochastic reference frame transformations. In the final study, we asked how torsional shifts of the retinal projection occurring with almost every eye movement influence orientation perception across saccades. We found a pre-saccadic, predictive remapping consistent with maintaining a purely retinal (but spatially inaccurate) orientation perception throughout the movement. Together these studies suggest that, despite their spatial inaccuracy, retinal signals play a surprisingly large role in our seamless visual experience. This work therefore represents a significant advance in our understanding of how the brain performs one of its most fundamental functions.