956 results for Multi-View Rendering


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Hip fracture is the leading cause of acute orthopaedic hospital admission amongst the elderly, with around a third of patients not surviving one year post-fracture. Although various preventative therapies are available, patient selection is difficult. The current state-of-the-art risk assessment tool (FRAX) ignores focal structural defects, such as cortical bone thinning, a critical component in characterizing hip fragility. Cortical thickness can be measured using CT, but this is expensive and involves a significant radiation dose. Instead, Dual-Energy X-ray Absorptiometry (DXA) is currently the preferred imaging modality for assessing hip fracture risk and is used routinely in clinical practice. Our ambition is to develop a tool to measure cortical thickness using multi-view DXA instead of CT. In this initial study, we work with digitally reconstructed radiographs (DRRs) derived from CT data as a surrogate for DXA scans: this enables us to compare directly the thickness estimates with the gold standard CT results. Our approach involves a model-based femoral shape reconstruction followed by a data-driven algorithm to extract numerous cortical thickness point estimates. In a series of experiments on the shaft and trochanteric regions of 48 proximal femurs, we validated our algorithm and established its performance limits using 20 views in the range 0°-171°: estimation errors were 0.19 ± 0.53 mm (mean ± one standard deviation). In a more clinically viable protocol using four views in the range 0°-51°, where no other bony structures obstruct the projection of the femur, measurement errors were -0.07 ± 0.79 mm. © 2013 SPIE.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we present, for the first time, results showing the effect of speaker head pose angle on automatic lip-reading performance over a wide range of closely spaced angles. We analyse the effect head pose has upon the features themselves and show that by selecting coefficients with minimum variance with respect to pose angle, recognition performance can be improved when train-test pose angles differ. Experiments are conducted using the initial phase of a unique multi-view audio-visual database designed specifically for research and development of pose-invariant lip-reading systems. Firstly, we show that it is the higher-order horizontal spatial frequency components that become most detrimental as the pose deviates. Secondly, we assess the performance of different feature selection masks across a range of pose angles, including a new mask based on Minimum Cross-Pose Variance coefficients. We report a relative improvement of 50% in Word Error Rate when using our selection mask over a common energy-based selection during profile-view lip-reading.
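The idea of selecting coefficients with minimum variance across pose angles can be illustrated with a minimal sketch. The abstract does not say how the variance is estimated or how many coefficients are kept, so the function names, the use of one mean feature vector per pose, and the `keep` parameter below are all assumptions made for illustration.

```python
from statistics import pvariance


def min_cross_pose_variance_mask(features_by_pose, keep):
    """Return the indices of the `keep` coefficients that vary least across poses.

    features_by_pose: list of equal-length feature vectors, one per pose angle
    (assumed here to be a mean feature vector for that pose; hypothetical layout).
    """
    n_coeffs = len(features_by_pose[0])
    # Variance of each coefficient taken across the pose angles.
    variances = [
        pvariance([vec[i] for vec in features_by_pose]) for i in range(n_coeffs)
    ]
    # Rank coefficients from most to least pose-stable and keep the first `keep`.
    ranked = sorted(range(n_coeffs), key=lambda i: variances[i])
    return sorted(ranked[:keep])


def apply_mask(vector, mask):
    """Keep only the coefficients selected by the mask."""
    return [vector[i] for i in mask]


# Toy data: three pose angles, four coefficients per feature vector.
features = [
    [1.0, 5.0, 0.2, 3.0],
    [1.1, 2.0, 0.2, 3.5],
    [0.9, 8.0, 0.2, 2.5],
]
mask = min_cross_pose_variance_mask(features, keep=2)
print(mask)  # [0, 2]
```

In this toy case coefficients 0 and 2 barely change with pose, so they are retained, while the strongly pose-dependent coefficient 1 is discarded.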

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This thesis addresses the reconstruction of a 3D model from multiple images. The 3D model is built with a hierarchical voxel representation in the form of an octree. A bounding cube enclosing the 3D model is computed from the camera positions. This cube contains the voxels and defines the positions of the virtual cameras. The 3D model is initialized with a convex hull based on the uniform background colour of the images. This hull is used to carve the periphery of the 3D model. A weighted cost is then computed to evaluate how well each voxel qualifies as part of the object's surface. This cost takes into account the similarity of the pixels from each image associated with the virtual camera. Finally, for each virtual camera, a surface is computed from the cost using the SGM method. SGM takes the neighbourhood into account when computing depth, and this thesis presents a variation of the method that accounts for voxels previously excluded from the model by the initialization step or by carving from another surface. The computed surfaces are then used to carve and finalize the 3D model. This thesis presents an innovative combination of steps for creating a 3D model from an existing set of images, or from a sequence of images captured in series, which could lead to the creation of a 3D model in real time.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

It is often assumed that humans generate a 3D reconstruction of the environment, either in egocentric or world-based coordinates, but the steps involved are unknown. Here, we propose two reconstruction-based models, evaluated using data from two tasks in immersive virtual reality. We model the observer’s prediction of landmark location based on standard photogrammetric methods and then combine location predictions to compute likelihood maps of navigation behaviour. In one model, each scene point is treated independently in the reconstruction; in the other, the pertinent variable is the spatial relationship between pairs of points. Participants viewed a simple environment from one location, were transported (virtually) to another part of the scene and were asked to navigate back. Error distributions varied substantially with changes in scene layout; we compared these directly with the likelihood maps to quantify the success of the models. We also measured error distributions when participants manipulated the location of a landmark to match the preceding interval, providing a direct test of the landmark-location stage of the navigation models. Models such as this, which start with scenes and end with a probabilistic prediction of behaviour, are likely to be increasingly useful for understanding 3D vision.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Biologically, the human brain processes information in both unimodal and multimodal ways. Information is progressively abstracted and seamlessly fused, and the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has produced an exponentially growing variety of data sources, which can be likened to the state of multimodality in the human brain. This inspires a methodology for exploring multimodal data and identifying multi-view patterns. Specifically, we propose a brain-inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. A structurally adaptive neural network is deployed to implement the proposed model. Furthermore, the acquisition of multi-view patterns with the proposed model is demonstrated and discussed with some experimental results.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The human brain processes information in both unimodal and multimodal fashion, where information is progressively captured, accumulated, abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has produced various sources of electronic data and continues to do so exponentially. Finding patterns in such multi-source and multimodal data could be compared to the multimodal and multidimensional information processing in the human brain. Such brain functionality could therefore serve as an inspiration for a methodology for exploring multimodal and multi-source electronic data and identifying multi-view patterns. In this paper, we first propose a brain-inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. Secondly, we present a cluster-driven approach for the implementation of the proposed brain-inspired model. In particular, the Growing Self Organising Maps (GSOM) based cross-clustering approach is discussed. Furthermore, the acquisition of multi-view patterns with the cluster-driven implementation is demonstrated with experimental results.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Humans perceive entities such as objects, patterns and events as concepts, which are the basic units in human intelligence and communications. In addition, perceptions of these entities can be abstracted and generalised at multiple levels of granularity. In particular, such granulation allows the formation and usage of concepts in human intelligence. This natural granularity in human intelligence could inspire and motivate the design and development of pattern identification approaches in data mining. In our opinion, a pattern can be perceived at multiple levels of granularity, and thus we advocate the co-existence of hierarchy and granularity. In addition, granular patterns exist across different sources of data (multimodality). In this paper, we present a cognitive model that incorporates the characteristics of Hierarchy, Granularity and Multimodality for multi-view pattern identification in the crime domain. The framework is implemented with Growing Self Organising Maps (GSOM), and some experimental results are presented and discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The massive amount of crime data generated daily has put law enforcement under intense pressure: it has to race against time to solve crime. In addition, the focus of crime investigation has expanded from the ability to catch criminals towards the ability to act before a crime happens (i.e. pre-crime). In this situation, the creation of crime profiles is very important to law enforcement, especially for understanding the behaviours of criminals and identifying the characteristics of similar crimes. Crime profiles could be used to solve similar crimes and thus enable pre-crime action. In this paper, a brain-inspired conceptual model is proposed and a structurally adaptive neural network is deployed for its implementation. The proposed model is then applied to the identification and presentation of multi-view crime patterns, which could be useful for the construction of crime profiles. Moreover, the suitability of the proposed model for crime profiling is discussed and demonstrated through some experimental results.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The goal of email classification is to classify user emails into spam and legitimate ones. Many supervised learning algorithms have been developed for this task, and these algorithms require a large amount of labeled training data. However, data labeling is labor-intensive and requires in-depth domain knowledge, so only a very small proportion of the data can be labeled in practice. This bottleneck greatly degrades the effectiveness of supervised email classification systems. To address this problem, we first identify some critical issues regarding supervised machine learning-based email classification. We then propose an effective classification model based on multi-view disagreement-based semi-supervised learning. The motivation for combining multi-view and semi-supervised learning is that multiple views can provide richer information for classification, which is often ignored in the literature, and semi-supervised learning provides the capability of coping with both labeled and unlabeled data. In the evaluation, we demonstrate that multi-view data can improve email classification compared with single-view data, and that the proposed model working with our algorithm achieves better performance than existing similar algorithms.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Face recognition with multiple views is a challenging research problem. Most existing work has focused on extracting shared information among multiple views to improve recognition. However, when the pose variation is too large or poses are missing, 'shared information' may not be properly extracted, leading to poor recognition results. In this paper, we propose a novel method for face recognition with multiple-view images that overcomes the large pose variation and missing pose issues. By introducing a novel mixed norm, the proposed method automatically selects candidates from the gallery to best represent a group of highly correlated face images in a query set and so improve classification accuracy. This mixed norm combines the advantages of both sparse representation based classification (SRC) and joint sparse representation based classification (JSRC): a trade-off between the ℓ1-norm from SRC and the ℓ2,1-norm from JSRC is introduced to achieve this goal. Owing to this property, the proposed method reduces the influence of a face image that is unseen and has large pose variation in the recognition process. When some face images with a certain degree of unseen pose variation appear, the mixed norm finds an optimal representation for these query images based on the shared information induced from multiple views. Moreover, we address an open problem in robust sparse representation and classification, namely using the ℓ1-norm on the loss function to achieve a robust solution. To solve this formulation, we derive a simple yet provably convergent algorithm based on the alternating direction method of multipliers (ADMM) framework. We provide extensive comparisons which demonstrate that our method outperforms other state-of-the-art algorithms on the CMU-PIE, Yale B and Multi-PIE databases for multi-view face recognition.
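To make the trade-off between the ℓ1-norm and the ℓ2,1-norm in the abstract above concrete, here is a minimal sketch that evaluates both norms on a coefficient matrix. The abstract does not give the exact form of the mixed norm, so the weighted sum and the weights `lam1` and `lam2` are assumptions for illustration only. Rows are assumed to index gallery atoms and columns to index query images.

```python
import math


def l1_norm(X):
    """Sum of absolute values of all entries (promotes element-wise sparsity)."""
    return sum(abs(v) for row in X for v in row)


def l21_norm(X):
    """Sum of the Euclidean norms of the rows (promotes row-wise, shared sparsity)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in X)


def mixed_norm(X, lam1, lam2):
    """Hypothetical mixed norm: a weighted sum of the two penalties above."""
    return lam1 * l1_norm(X) + lam2 * l21_norm(X)


# Toy coefficient matrix: 3 gallery atoms (rows) x 2 query images (columns).
X = [
    [3.0, 4.0],
    [0.0, 0.0],
    [0.0, 1.0],
]
print(l1_norm(X))               # 8.0
print(l21_norm(X))              # 6.0
print(mixed_norm(X, 0.5, 0.5))  # 7.0
```

Setting `lam2` to zero recovers a pure ℓ1 penalty, in which each query image picks its own atoms. Setting `lam1` to zero recovers a pure ℓ2,1 penalty, in which all query images are pushed to share the same atoms.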

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In many real-world computer vision applications, such as multi-camera surveillance, the objects of interest are captured by visual sensors concurrently, resulting in multi-view data. These views usually provide complementary information to each other. One recent and powerful computer vision method for clustering is sparse subspace clustering (SSC); however, it was not designed for multi-view data, which break its linear separability assumption. To integrate complementary information between views, multi-view clustering algorithms are required to improve clustering performance. In this paper, we propose a novel multi-view subspace clustering method that searches for a unified latent structure as a global affinity matrix in subspace clustering. Because it integrates the affinity matrices of each view, this global affinity matrix can best represent the relationship between clusters, which could help achieve better performance on face clustering. To solve the formulation, we derive a provably convergent and computationally efficient algorithm based on the alternating direction method of multipliers (ADMM) framework. We demonstrate that this formulation outperforms state-of-the-art alternatives on challenging multi-view face datasets.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

These slides present several 3-D reconstruction methods to obtain the geometric structure of a scene that is viewed by multiple cameras. We focus on combining the geometric modeling of the image formation process with standard optimization tools to estimate the characteristic parameters that describe the geometry of the 3-D scene. In particular, linear, non-linear and robust methods to estimate the monocular and epipolar geometry are introduced as cornerstones for generating 3-D reconstructions with multiple cameras. Some examples of systems that use this constructive strategy are Bundler, PhotoSynth and VideoSurfing, which are able to obtain 3-D reconstructions with several hundreds or thousands of cameras.
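As an illustration of a linear method for estimating epipolar geometry, the sketch below implements the normalized eight-point algorithm with NumPy. The abstract above does not say which linear method the slides cover, so this is one standard choice offered as an assumption. The synthetic camera setup in `synthetic_correspondences` (intrinsics, rotation and translation) is entirely hypothetical.

```python
import numpy as np


def normalize_points(pts):
    """Hartley normalization: move the centroid to the origin, mean distance sqrt(2)."""
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T


def eight_point(pts1, pts2):
    """Estimate F with x2^T F x1 = 0 from N >= 8 pixel correspondences (N x 2 arrays)."""
    x1, T1 = normalize_points(pts1)
    x2, T2 = normalize_points(pts2)
    # Each correspondence contributes one linear equation in the 9 entries of F.
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1)),
    ])
    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)


def synthetic_correspondences(n=20, seed=0):
    """Project random 3-D points into two hypothetical pinhole cameras."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([rng.uniform(-1, 1, n),
                         rng.uniform(-1, 1, n),
                         rng.uniform(4, 8, n)])
    K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
    theta = 0.1  # small rotation of the second camera about the y axis
    R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(theta), 0.0, np.cos(theta)]])
    t = np.array([-0.5, 0.05, 0.1])
    p1 = (K @ X.T).T
    p2 = (K @ (R @ X.T + t[:, None])).T
    return p1[:, :2] / p1[:, 2:], p2[:, :2] / p2[:, 2:]


pts1, pts2 = synthetic_correspondences()
F = eight_point(pts1, pts2)
h1 = np.column_stack([pts1, np.ones(len(pts1))])
h2 = np.column_stack([pts2, np.ones(len(pts2))])
# Epipolar residuals x2^T F x1; they should be near zero for noise-free data.
residuals = np.abs(np.einsum("ij,jk,ik->i", h2, F, h1))
print(residuals.max())
```

With noisy or mismatched correspondences, this linear estimate would normally serve only as an initial value for the non-linear and robust refinements that the abstract also mentions.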

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Multi-view microscopy techniques such as Light-Sheet Fluorescence Microscopy (LSFM) are powerful tools for 3D + time studies of live embryos in developmental biology. The sample is imaged from several points of view, acquiring a set of 3D views that are then combined or fused in order to overcome their individual limitations. View fusion is still an open problem despite recent contributions in the field. We developed a wavelet-based multi-view fusion method that, owing to the properties of wavelet decomposition, is able to combine the complementary directional information from all available views into a single volume. Our method is demonstrated on LSFM acquisitions from live sea urchin and zebrafish embryos. The fusion results show improved overall contrast and detail when compared with any of the acquired volumes. The proposed method does not need knowledge of the system's point spread function (PSF) and performs better than other existing PSF-independent fusion methods.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Electronic and mechanical media such as film, television, photography and offset printing are just examples of how fast and important technological development has become in society. Emerging technologies and continuous development have steadily provided newer and better possibilities for advanced services. Multi-view video has now been developed with different tools and applications, with the main goal of being more innovative and of delivering its technical offerings in a way that is friendly to all users in terms of management and accessibility (only an internet connection is needed). The intention of any technology is to innovate in order to gain users and become popular, so it is important to carry out an implementation in this case. Given the reach of multi-view video and the importance of becoming more global, we have implemented an application that supports this aim by offering language selection within the same scenario. Finally, it is important to point out that, thanks to the continuous technological progress of multi-view video, a more intercultural market will become reachable, contributing to shared growth in global development.