983 results for Multiple sparse cameras
Abstract:
A single picture provides a largely incomplete representation of the scene one is looking at. Usually it reproduces only a limited spatial portion of the scene according to the standpoint and the viewing angle, and it contains only instantaneous information. Thus very little can be understood about the geometrical structure of the scene, and the position and orientation of the observer with respect to it also remain hard to guess. When multiple views, taken from different positions in space and time, observe the same scene, a much deeper knowledge is potentially achievable. Understanding inter-view relations enables construction of a collective representation by fusing the information contained in every single image. Visual reconstruction methods confront the formidable, and still unanswered, challenge of delivering a comprehensive representation of the structure, motion and appearance of a scene from visual information. Multi-view visual reconstruction deals with the inference of relations among multiple views and the exploitation of revealed connections to attain the best possible representation. This thesis investigates novel methods and applications in the field of visual reconstruction from multiple views. Three main threads of research have been pursued: dense geometric reconstruction, camera pose reconstruction, and sparse geometric reconstruction of deformable surfaces. Dense geometric reconstruction aims at delivering the appearance of a scene at every single point. The construction of a large panoramic image from a set of traditional pictures has been extensively studied in the context of image mosaicing techniques. An original algorithm for sequential registration suitable for real-time applications has been conceived. The integration of the algorithm into a visual surveillance system has led to robust and efficient motion detection with Pan-Tilt-Zoom cameras.
Moreover, an evaluation methodology for quantitatively assessing and comparing image mosaicing algorithms has been devised and made available to the community. Camera pose reconstruction deals with the recovery of the camera trajectory across an image sequence. A novel mosaic-based pose reconstruction algorithm has been conceived that exploits image mosaics and traditional pose estimation algorithms to deliver more accurate estimates. An innovative markerless vision-based human-machine interface has also been proposed, allowing a user to interact with a gaming application by moving a hand-held consumer-grade camera in unstructured environments. Finally, sparse geometric reconstruction refers to the computation of the coarse geometry of an object at a few preset points. In this thesis, an innovative shape reconstruction algorithm for deformable objects has been designed. A cooperation with the Solar Impulse project allowed the algorithm to be deployed in a very challenging real-world scenario, i.e. the accurate measurement of airplane wing deformations.
Abstract:
Aims. Approach observations with the Optical, Spectroscopic, and Infrared Remote Imaging System (OSIRIS) experiment onboard Rosetta are used to determine the rotation period, the direction of the spin axis, and the state of rotation of comet 67P’s nucleus. Methods. Photometric time series of 67P have been acquired by OSIRIS since the post wake-up commissioning of the payload in March 2014. Fourier analysis and convex shape inversion methods have been applied to the Rosetta data as well as to the available ground-based observations. Results. Evidence is found that the rotation rate of 67P changed significantly near the time of its 2009 perihelion passage, probably due to sublimation-induced torque. We find that the sidereal rotation periods P1 = 12.76129 ± 0.00005 h and P2 = 12.4043 ± 0.0007 h for the apparitions before and after the 2009 perihelion, respectively, provide the best fit to the observations. No signs of multiple periodicity are found in the light curves down to the noise level, which implies that the comet is presently in a simple rotation state around its axis of largest moment of inertia. We derive a prograde rotation model with spin vector J2000 ecliptic coordinates λ = 65° ± 15°, β = + 59° ± 15°, corresponding to equatorial coordinates RA = 22°, Dec = + 76°. However, we find that the mirror solution, also prograde, at λ = 275° ± 15°, β = + 50° ± 15° (or RA = 274°, Dec = + 27°), is also possible at the same confidence level, due to the intrinsic ambiguity of the photometric problem for observations performed close to the ecliptic plane.
Abstract:
We present observations of total cloud cover and cloud type classification results from a sky camera network comprising four stations in Switzerland. In a comprehensive intercomparison study, records of total cloud cover from the sky camera, long-wave radiation observations, Meteosat, ceilometer, and visual observations were compared. Total cloud cover from the sky camera was in 65–85% of cases within ±1 okta with respect to the other methods. The sky camera overestimates cloudiness with respect to the other automatic techniques on average by up to 1.1 ± 2.8 oktas but underestimates it by 0.8 ± 1.9 oktas compared to the human observer. However, the bias depends on the cloudiness and therefore needs to be considered when records from various observational techniques are being homogenized. Cloud type classification was conducted using the k-Nearest Neighbor classifier in combination with a set of color and textural features. In addition, a radiative feature was introduced which improved the discrimination by up to 10%. The performance of the algorithm mainly depends on the atmospheric conditions, site-specific characteristics, the randomness of the selected images, and possible visual misclassifications: The mean success rate was 80–90% when the image only contained a single cloud class but dropped to 50–70% if the test images were completely randomly selected and multiple cloud classes occurred in the images.
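The k-Nearest Neighbor step mentioned in this abstract can be illustrated with a minimal Python sketch. The two features (a red-to-blue colour ratio and a texture contrast value), the training values, and the class labels are hypothetical placeholders chosen for illustration; the abstract does not list the actual feature set, and the radiative feature is not modelled here.

```python
import math
from collections import Counter

def knn_classify(training_set, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    `training_set` is a list of (feature_vector, label) pairs; distance is
    plain Euclidean distance in feature space.
    """
    nearest = sorted(training_set, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (colour ratio, texture contrast) vectors, for illustration only.
TRAINING_SET = [
    ((0.95, 0.10), "stratus"),
    ((0.92, 0.12), "stratus"),
    ((0.90, 0.08), "stratus"),
    ((0.70, 0.45), "cumulus"),
    ((0.72, 0.50), "cumulus"),
    ((0.68, 0.40), "cumulus"),
    ((0.45, 0.05), "clear sky"),
    ((0.48, 0.07), "clear sky"),
    ((0.50, 0.04), "clear sky"),
]

if __name__ == "__main__":
    print(knn_classify(TRAINING_SET, (0.71, 0.46)))
```

An image containing several cloud classes yields a feature vector that lies between the class clusters, which is one plausible reading of why the reported success rate drops for mixed-class images.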
Abstract:
In this paper we propose a novel fast random search clustering (RSC) algorithm for mixing matrix identification in multiple input multiple output (MIMO) linear blind inverse problems with sparse inputs. The proposed approach is based on the clustering of the observations around the directions given by the columns of the mixing matrix that occurs typically for sparse inputs. Exploiting this fact, the RSC algorithm proceeds by parameterizing the mixing matrix using hyperspherical coordinates, randomly selecting candidate basis vectors (i.e. clustering directions) from the observations, and accepting or rejecting them according to a binary hypothesis test based on the Neyman–Pearson criterion. The RSC algorithm is not tailored to any specific distribution for the sources, can deal with an arbitrary number of inputs and outputs (thus solving the difficult under-determined problem), and is applicable to both instantaneous and convolutive mixtures. Extensive simulations for synthetic and real data with different numbers of inputs and outputs, data sizes, sparsity factors of the inputs and signal-to-noise ratios confirm the good performance of the proposed approach under moderate/high signal-to-noise ratios. SUMMARY. Blind source separation method for sparse signals based on identifying the mixing matrix by means of random clustering techniques.
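The clustering idea described in this abstract can be sketched in Python for the simplest case. This is a simplified illustration and not the RSC algorithm itself: it assumes only two outputs, so each direction is a single angle rather than a set of hyperspherical coordinates, and a fixed support-count threshold stands in for the Neyman–Pearson hypothesis test. The function names and all parameter values are hypothetical.

```python
import math
import random

def angular_distance(a, b):
    """Distance between two line directions, i.e. angles taken modulo pi."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)

def find_mixing_directions(observations, n_trials=200, tol=0.05,
                           min_fraction=0.2, separation=0.2, rng=None):
    """Estimate the directions around which 2-D observations cluster.

    A candidate direction is drawn at random from the observations and is
    accepted if enough observations lie within `tol` radians of it (a crude
    stand-in for a formal hypothesis test) and it is not closer than
    `separation` radians to an already accepted direction.
    """
    rng = rng or random.Random(0)
    angles = [math.atan2(y, x) % math.pi for x, y in observations]
    accepted = []
    for _ in range(n_trials):
        candidate = rng.choice(angles)
        if any(angular_distance(candidate, a) < separation for a in accepted):
            continue  # this direction has already been found
        support = sum(1 for a in angles if angular_distance(candidate, a) < tol)
        if support >= min_fraction * len(angles):
            accepted.append(candidate)
    return sorted(accepted)
```

Each accepted angle θ corresponds to an estimated mixing-matrix column (cos θ, sin θ), recovered up to sign and scale, which is the usual ambiguity in blind identification.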
Abstract:
Atrial fibrillation (AF) is a common heart disorder. One of the most prominent hypotheses about its initiation and maintenance considers multiple uncoordinated activation foci inside the atrium. However, the implicit assumption behind all the signal processing techniques used for AF, such as dominant frequency and organization analysis, is the existence of a single regular component in the observed signals. In this paper we take into account the existence of multiple foci, performing a spectral analysis to detect their number and frequencies. In order to obtain a cleaner signal on which the spectral analysis can be performed, we introduce sparsity-aware learning techniques to infer the spike trains corresponding to the activations. The good performance of the proposed algorithm is demonstrated on both synthetic and real data. SUMMARY. Algorithm based on sparse regression techniques for extracting cardiac signals in patients with atrial fibrillation (AF).
Abstract:
A real-time surveillance system for IP network cameras is presented. Motion, part-body, and whole-body detectors are efficiently combined to generate robust and fast detections, which feed multiple compressive trackers. The generated trajectories are then improved using a reidentification strategy for long term operation.
Abstract:
In recent years, the computer vision community has shown great interest in depth-based applications thanks to the performance and flexibility of the new generation of RGB-D imagery. In this paper, we present an efficient background subtraction algorithm based on the fusion of multiple region-based classifiers that processes depth and color data provided by RGB-D cameras. Foreground objects are detected by combining a region-based foreground prediction (based on depth data) with different background models (based on a Mixture of Gaussians algorithm) providing color and depth descriptions of the scene at pixel and region level. The information given by these modules is fused in a mixture-of-experts fashion to improve the foreground detection accuracy. The main contributions of the paper are the region-based models of both background and foreground, built from the depth and color data. The results obtained on different database sequences demonstrate that the proposed approach achieves higher detection accuracy than existing state-of-the-art techniques.
Abstract:
This paper discusses the target localization problem in wireless visual sensor networks. Additive noise and measurement errors affect the accuracy of target localization when the visual nodes are equipped with low-resolution cameras. To improve the accuracy of target localization without prior knowledge of the target, each node extracts multiple feature points from images to represent the target at the sensor node level. A statistical method is presented to match the most correlated feature point pair for merging the position information of different sensor nodes at the base station. In addition, for the case in which more than one target exists in the field of interest, a scheme for locating multiple targets is provided. Simulation results show that the proposed method improves the accuracy of locating a single target or multiple targets. Results also show that the proposed method achieves a better trade-off between camera node usage and localization accuracy.
Abstract:
Many applications, including object reconstruction, robot guidance, and scene mapping, require the registration of multiple views of a scene to generate a complete geometric and appearance model of it. In real situations, transformations between views are unknown and it is necessary to apply expert inference to estimate them. In the last few years, the emergence of low-cost depth-sensing cameras has strengthened the research on this topic, motivating a plethora of new applications. Although they have enough resolution and accuracy for many applications, some situations may not be solved with general state-of-the-art registration methods due to the signal-to-noise ratio (SNR) and the resolution of the data provided. The problem of working with low-SNR data may, in general terms, appear in any 3D system, so novel solutions are needed in this respect. In this paper, we propose a method, μ-MAR, that performs both coarse and fine registration of sets of 3D points into a common coordinate system. The points are provided by low-cost depth-sensing cameras, although the method is not restricted to these sensors. The method overcomes the noisy data problem by means of a model-based solution of multiplane registration. Specifically, it iteratively registers 3D markers composed of multiple planes extracted from points of multiple views of the scene. As the markers and the object of interest are static in the scenario, the transformations obtained for the markers are applied to the object in order to reconstruct it. Experiments have been performed using synthetic and real data. The synthetic data allows a qualitative and quantitative evaluation by means of visual inspection and Hausdorff distance respectively. The real data experiments show the performance of the proposal using data acquired by a Primesense Carmine RGB-D sensor. The method has been compared to several state-of-the-art methods.
The results show the good performance of μ-MAR in registering objects with high accuracy in the presence of noisy data, outperforming the existing methods.
Abstract:
In recent years there has been an increased interest in applying non-parametric methods to real-world problems. Significant research has been devoted to Gaussian processes (GPs) due to their increased flexibility when compared with parametric models. These methods use Bayesian learning, which generally leads to analytically intractable posteriors. This thesis proposes a two-step solution to construct a probabilistic approximation to the posterior. In the first step we adapt Bayesian online learning to GPs: the final approximation to the posterior is the result of propagating the first and second moments of intermediate posteriors obtained by combining a new example with the previous approximation. The propagation of functional forms is solved by showing the existence of a parametrisation of the posterior moments that uses combinations of the kernel function at the training points, transforming the Bayesian online learning of functions into a parametric formulation. The drawback is the prohibitive quadratic scaling of the number of parameters with the size of the data, making the method inapplicable to large datasets. The second step solves the problem of the exploding parameter size and makes GPs applicable to arbitrarily large datasets. The approximation is based on a measure of distance between two GPs, the KL-divergence between GPs. This second approximation uses a constrained GP in which only a small subset of the whole training dataset is used to represent the GP. This subset is called the Basis Vector (BV) set, and the resulting GP is a sparse approximation to the true posterior. As this sparsity is based on KL-minimisation, it is probabilistic and independent of the way the posterior approximation from the first step is obtained. We combine the sparse approximation with an extension to the Bayesian online algorithm that allows multiple iterations for each input and thus approximates a batch solution.
The resulting sparse learning algorithm is generic: for different problems we only change the likelihood. The algorithm is applied to a variety of problems, and we examine its performance both on more classical regression and classification tasks and on data-assimilation and simple density estimation problems.
Abstract:
The rapidly increasing demand for cellular telephony is placing greater demand on the limited bandwidth resources available. This research is concerned with techniques which enhance the capacity of a Direct-Sequence Code-Division-Multiple-Access (DS-CDMA) mobile telephone network. The capacities of both Private Mobile Radio (PMR) and cellular networks are derived, and the many techniques which are currently available are reviewed. Areas which may be further investigated are identified. One technique which is developed is the sectorisation of a cell into toroidal rings. Splitting the cell into these concentric rings is shown to provide an increased system capacity, and this is compared with cell clustering and other sectorisation schemes. Another technique for increasing the capacity is to add to the amount of inherent randomness within the transmitted signal so that the system is better able to extract the wanted signal. A system model has been produced for a cellular DS-CDMA network and the results are presented for two possible strategies. One of these strategies is the variation of the chip duration over a signal bit period. Several different variation functions are tried and a sinusoidal function is shown to provide the greatest increase in the maximum number of system users for any given signal-to-noise ratio. The other strategy considered is the use of additive amplitude modulation together with data/chip phase-shift-keying. The amplitude variations are determined by a sparse code so that the average system power is held near its nominal level. This strategy is shown to provide no further capacity since the system is sensitive to amplitude variations. When both strategies are employed, however, the sensitivity to amplitude variations is shown to reduce, thus indicating that the first strategy increases both the capacity and the ability to handle fluctuations in the received signal power.
Abstract:
Optimal paths connecting randomly selected network nodes and fixed routers are studied analytically in the presence of a nonlinear overlap cost that penalizes congestion. Routing becomes more difficult as the number of selected nodes increases and exhibits ergodicity breaking in the case of multiple routers. The ground state of such systems reveals nonmonotonic complex behaviors in average path length and algorithmic convergence, depending on the network topology, and densities of communicating nodes and routers. A distributed linearly scalable routing algorithm is also devised. © 2012 American Physical Society.
Abstract:
Massive multi-user multiple-input multiple-output (MU-MIMO) systems are cellular networks where the base stations (BSs) are equipped with hundreds of antennas, N, and communicate with tens of mobile stations (MSs), K, such that N ≫ K ≫ 1. Contrary to most prior works, in this paper we consider the uplink of a single-cell massive MIMO system operating in sparse channels with limited scattering. This case is of particular importance in most propagation scenarios, where the prevalent Rayleigh fading assumption becomes idealistic. We derive analytical approximations for the achievable rates of maximum-ratio combining (MRC) and zero-forcing (ZF) receivers. Furthermore, we study the asymptotic behavior of the achievable rates for both MRC and ZF receivers when N and K go to infinity under the condition that N/K → c ≥ 1. Our results indicate that the achievable rate of MRC receivers reaches an asymptotic saturation limit, whereas the achievable rate of ZF receivers grows logarithmically with the number of MSs.
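The two linear receivers named in this abstract can be illustrated with a small Python sketch. It is a toy example under stated assumptions, not the paper's analysis: it handles K = 2 users only, uses a closed-form 2 × 2 inverse for ZF, ignores noise, and makes no attempt to model sparse channels or achievable rates. The function names are hypothetical.

```python
def matched_filter(H, y):
    """Return H^H y for an N x K channel matrix H (list of rows) and vector y."""
    n_users = len(H[0])
    return [sum(row[k].conjugate() * y_n for row, y_n in zip(H, y))
            for k in range(n_users)]

def gram(H):
    """Return the K x K Gram matrix H^H H."""
    n_users = len(H[0])
    return [[sum(row[i].conjugate() * row[j] for row in H)
             for j in range(n_users)] for i in range(n_users)]

def mrc_detect(H, y):
    """MRC: matched filter per user, normalised by that user's channel energy."""
    z = matched_filter(H, y)
    G = gram(H)
    return [z[k] / G[k][k].real for k in range(len(z))]

def zf_detect(H, y):
    """ZF for K = 2: solve (H^H H) x = H^H y with a closed-form 2 x 2 inverse."""
    z = matched_filter(H, y)
    (a, b), (c, d) = gram(H)
    det = a * d - b * c
    return [(d * z[0] - b * z[1]) / det, (a * z[1] - c * z[0]) / det]
```

In the noise-free case ZF removes inter-user interference entirely, while the MRC estimate of each symbol still contains a contribution from the other user, scaled by the off-diagonal entries of the Gram matrix.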
Abstract:
In this paper, we demonstrate a digital signal processing (DSP) algorithm for improving spatial resolution of images captured by CMOS cameras. The basic approach is to reconstruct a high resolution (HR) image from a shift-related low resolution (LR) image sequence. The aliasing relationship of Fourier transforms between discrete and continuous images in the frequency domain is used for mapping LR images to a HR image. The method of projection onto convex sets (POCS) is applied to trace the best estimate of pixel matching from the LR images to the reconstructed HR image. Computer simulations and preliminary experimental results have shown that the algorithm works effectively on the application of post-image-captured processing for CMOS cameras. It can also be applied to HR digital image reconstruction, where shift information of the LR image sequence is known.
Abstract:
The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve on the performance of individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model based on a random walk model on multiple graphs to fuse graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by a greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low-level features and attributes. A low-rank attribute embedding is jointly learned within the multi-task learning formulation to embed original binary attributes into a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered.
To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.