892 results for SIFT, Computer Vision, Python, Object Recognition, Feature Detection, Descriptor Computation


Relevance: 100.00%

Abstract:

The perception of global form requires integration of local visual cues across space and is the foundation for object recognition. Here we used magnetoencephalography (MEG) to study the location and time course of neuronal activity associated with the perception of global structure from local image features. To minimize neuronal activity evoked by low-level stimulus properties, such as luminance and contrast, the local image features were held constant during all phases of the MEG recording. This allowed us to assess the relative importance of striate (V1) versus extrastriate cortex in global form perception.

Relevance: 100.00%

Abstract:

The ability to recognize individual faces is of crucial social importance for humans and evolutionarily necessary for survival. Consequently, faces may be “special” stimuli, for which we have developed unique modular perceptual and recognition processes. Some of the strongest evidence for face processing being modular comes from cases of prosopagnosia, where patients are unable to recognize faces whilst retaining the ability to recognize other objects. Here we present the case of an acquired prosopagnosic whose poor recognition was linked to a perceptual impairment in face processing. Despite this, she had intact object recognition, even at a subordinate level. She also showed a normal ability to learn and to generalize learning of nonfacial exemplars differing in the nature and arrangement of their parts, along with impaired learning and generalization of facial exemplars. The case provides evidence for modular perceptual processes for faces.

Relevance: 100.00%

Abstract:

Image segmentation is one of the most computationally intensive operations in image processing and computer vision. This is because a large volume of data is involved and many different features have to be extracted from the image data. This thesis is concerned with the investigation of practical issues related to the implementation of several classes of image segmentation algorithms on parallel architectures. The Transputer is used as the basic building block of hardware architectures and Occam is used as the programming language. The segmentation methods chosen for implementation are convolution, for edge-based segmentation; the Split and Merge algorithm for segmenting non-textured regions; and the Granlund method for segmentation of textured images. Three different convolution methods have been implemented. The direct method of convolution, carried out in the spatial domain, uses the array architecture. The other two methods, based on convolution in the frequency domain, require the use of the two-dimensional Fourier transform. Parallel implementations of two different Fast Fourier Transform algorithms have been developed, incorporating original solutions. For the Row-Column method the array architecture has been adopted, and for the Vector-Radix method, the pyramid architecture. The texture segmentation algorithm, for which a system-level design is given, demonstrates a further application of the Vector-Radix Fourier transform. A novel concurrent version of the quad-tree-based Split and Merge algorithm has been implemented on the pyramid architecture. The performance of the developed parallel implementations is analysed. Many of the obtained speed-up and efficiency measures show values close to their respective theoretical maxima. Where appropriate, comparisons are drawn between different implementations. The thesis concludes with comments on general issues related to the use of the Transputer system as a development tool for image processing applications, and on issues related to the engineering of concurrent image processing applications.
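The quad-tree Split and Merge algorithm referred to above can be illustrated with a minimal, sequential Python sketch of the split phase. The variance-based homogeneity test, threshold, and minimum block size are assumptions made for illustration; the merge phase and the concurrent pyramid mapping developed in the thesis are not shown.

import numpy as np

def split(region, img, threshold, min_size=8):
    # Recursively split a square region while its intensity variance
    # exceeds the homogeneity threshold (quad-tree split phase only).
    r0, c0, size = region
    block = img[r0:r0 + size, c0:c0 + size]
    if size <= min_size or block.var() <= threshold:
        return [region]                      # homogeneous leaf
    half = size // 2
    leaves = []
    for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
        leaves += split((r0 + dr, c0 + dc, half), img, threshold, min_size)
    return leaves

# Toy usage: a 128x128 image made of two flat halves.
img = np.zeros((128, 128))
img[:, 64:] = 1.0
print(len(split((0, 0, 128), img, threshold=0.01)), "homogeneous leaves")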

Relevance: 100.00%

Abstract:

The dramatic effects of brain damage can provide some of the most interesting insights into the nature of normal cognitive performance. In recent years a number of neuropsychological studies have reported a particular form of cognitive impairment where patients have problems recognising objects from one category but remain able to recognise those from others. The most frequent ‘category-specific’ pattern is an impairment identifying living things, compared to nonliving things. The reverse pattern of dissociation, i.e., an impairment recognising and naming nonliving things relative to living things, has been reported albeit much less frequently. The objective of the work carried out in this thesis was to investigate the organising principles and anatomical correlates of stored knowledge for categories of living and nonliving things. Three complementary cognitive neuropsychological research techniques were employed to assess how, and where, this knowledge is represented in the brain: (i) studies of normal (neurologically intact) subjects, (ii) case-studies of neurologically impaired patients with selective deficits in object recognition, and (iii) studies of the anatomical correlates of stored knowledge for living and nonliving things on the brain using magnetoencephalography (MEG). The main empirical findings showed that semantic knowledge about living and nonliving things is principally encoded in terms of sensory and functional features, respectively. In two case-study chapters evidence was found supporting the view that category-specific impairments can arise from damage to a pre-semantic system, rather than the assumption often made that the system involved must be semantic. In the MEG study, rather than finding evidence for the involvement of specific brain areas for different object categories, it appeared that, when subjects named and categorised living and nonliving things, a non-differentiated neural system was involved.

Relevance: 100.00%

Abstract:

This paper addresses the problem of obtaining detailed 3d reconstructions of human faces in real-time and with inexpensive hardware. We present an algorithm based on a monocular multi-spectral photometric-stereo setup. This system is known to capture highly detailed deforming 3d surfaces at high frame rates without requiring expensive hardware or a synchronized light stage. However, the main challenge of such a setup is the calibration stage, which depends on the lighting setup and on how the lights interact with the specific material being captured, in this case human faces. For this purpose we develop a self-calibration technique in which the person being captured is asked to perform a rigid motion in front of the camera while maintaining a neutral expression. Rigidity constraints are then used to compute the head's motion with a structure-from-motion algorithm. Once the motion is obtained, a multi-view stereo algorithm reconstructs a coarse 3d model of the face. This coarse model is then used to estimate the lighting parameters with a stratified approach: in the first step we use a RANSAC search to identify purely diffuse points on the face and to simultaneously estimate this diffuse reflectance model. In the second step we apply non-linear optimization to fit a non-Lambertian reflectance model to the outliers of the previous step. The calibration procedure is validated with synthetic and real data.
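The RANSAC stage for isolating purely diffuse points can be sketched in Python as follows. The single-channel Lambertian model I = n·L (with L absorbing the light direction and a global albedo), the residual tolerance, and the synthetic data are illustrative assumptions and do not reproduce the paper's full calibration pipeline.

import numpy as np

def ransac_lambertian(normals, intensities, iters=500, tol=0.02, seed=0):
    # Robustly fit I = n . L to (normal, intensity) pairs; inliers are
    # treated as purely diffuse points, while outliers would be passed on
    # to a non-Lambertian refinement (not shown).
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(normals), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(normals), size=3, replace=False)
        L, *_ = np.linalg.lstsq(normals[idx], intensities[idx], rcond=None)
        inliers = np.abs(normals @ L - intensities) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least-squares refit on all inliers.
    L, *_ = np.linalg.lstsq(normals[best_inliers], intensities[best_inliers], rcond=None)
    return L, best_inliers

# Synthetic check: 1000 unit normals, the first 200 made non-diffuse.
rng = np.random.default_rng(1)
n = rng.normal(size=(1000, 3))
n /= np.linalg.norm(n, axis=1, keepdims=True)
I = n @ np.array([0.3, 0.2, 0.9])
I[:200] += rng.uniform(0.1, 0.5, 200)
print(ransac_lambertian(n, I))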

Relevance: 100.00%

Abstract:

We present a probabilistic, online, depth map fusion framework, whose generative model for the sensor measurement process accurately incorporates both long-range visibility constraints and a spatially varying, probabilistic outlier model. In addition, we propose an inference algorithm that updates the state variables of this model in linear time each frame. Our detailed evaluation compares our approach against several others, demonstrating and explaining the improvements that this model offers, as well as highlighting a problem with all current methods: systemic bias. © 2012 Springer-Verlag.
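As a rough illustration of what an online, outlier-aware depth update can look like, here is a per-pixel Gaussian-plus-uniform sketch in Python. This is a simplified stand-in, not the paper's generative model: the long-range visibility constraints and the spatially varying outlier model are omitted, and the inlier-probability update rule is an ad hoc assumption.

import numpy as np

def fuse_depth(mu, var, pi, z, var_z, z_min=0.1, z_max=10.0):
    # State: Gaussian depth belief (mu, var) and inlier probability pi.
    # A measurement z with variance var_z is either an inlier from the
    # Gaussian or an outlier drawn uniformly over [z_min, z_max].
    s2 = var + var_z
    p_in = pi * np.exp(-0.5 * (z - mu) ** 2 / s2) / np.sqrt(2 * np.pi * s2)
    p_out = (1.0 - pi) / (z_max - z_min)
    w = p_in / (p_in + p_out)                 # posterior inlier weight
    k = w * var / (var + var_z)               # weighted Kalman-style gain
    mu_new = mu + k * (z - mu)
    var_new = (1.0 - k) * var
    pi_new = 0.9 * w + 0.1 * pi               # ad hoc damped inlier update
    return mu_new, var_new, pi_new

state = (2.0, 0.5, 0.5)                       # initial belief for one pixel
for z in [2.1, 2.05, 7.9, 2.02]:              # 7.9 behaves as an outlier
    state = fuse_depth(*state, z=z, var_z=0.01)
    print(state)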

Relevance: 100.00%

Abstract:

A sizeable amount of the testing in eye care requires either the identification of targets such as letters to assess functional vision, or the subjective evaluation of imagery by an examiner. Computers can render a variety of different targets on their monitors and can be used to store and analyse ophthalmic images. However, existing computing hardware tends to be large, screen resolutions are often too low, and objective assessments of ophthalmic images are unreliable. Recent advances in mobile computing hardware and computer-vision systems can be used to enhance clinical testing in optometry. High-resolution touch screens embedded in mobile devices can render targets at a wide variety of distances and can be used to record and respond to patient responses, automating testing methods. This has opened up new opportunities in computerised near vision testing. Equally, new image processing techniques can be used to increase the validity and reliability of objective computer vision systems. Three novel apps for assessing reading speed, contrast sensitivity and amplitude of accommodation were created by the author to demonstrate the potential of mobile computing to enhance clinical measurement. The reading speed app could present sentences effectively, control illumination and automate the testing procedure for reading speed assessment. Meanwhile, the contrast sensitivity app made use of a bit-stealing technique and a swept-frequency target to rapidly assess a patient’s full contrast sensitivity function at both near and far distances. Finally, customised electronic hardware was created and interfaced to an app on a smartphone device to allow free-space amplitude of accommodation measurement. A new geometrical model of the tear film and a ray-tracing simulation of a Placido disc topographer were produced to provide insights into the effect of tear film breakdown on ophthalmic images. Furthermore, a new computer vision system, which used a novel eyelash segmentation technique, was created to demonstrate the potential of computer vision systems for the clinical assessment of tear stability. Studies undertaken by the author to assess the validity and repeatability of the novel apps found that their repeatability was comparable to, or better than, existing clinical methods for reading speed and contrast sensitivity assessment. Furthermore, the apps offered reduced examination times in comparison to their paper-based equivalents. The reading speed and amplitude of accommodation apps correlated highly with existing methods of assessment, supporting their validity. There remain questions over the validity of using a swept-frequency sine-wave target to assess patients’ contrast sensitivity functions, as no clinical test provides the same range of spatial frequencies and contrasts, nor equivalent assessment at distance and near. A validation study of the new computer vision system found that the author’s tear metric correlated better with existing subjective measures of tear film stability than that of a competing computer-vision system. However, repeatability was poor in comparison to the subjective measures due to eyelash interference. The new mobile apps, computer vision system, and studies outlined in this thesis provide further insight into the potential of applying mobile and image processing technology to enhance clinical testing by eye care professionals.
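The swept-frequency target mentioned above is in the spirit of a Campbell-Robson chart; a minimal NumPy sketch that renders one is given below. The sweep ranges, contrast limits, and screen size are illustrative assumptions, and the bit-stealing step used to extend the effective grey-level resolution is not shown.

import numpy as np

def swept_frequency_chart(width=1024, height=768, f_min=0.5, f_max=50.0,
                          c_min=0.005, c_max=1.0, mean=0.5):
    # Spatial frequency rises logarithmically from left to right and
    # contrast falls logarithmically along the vertical axis, so the
    # visible envelope of the grating traces the contrast sensitivity
    # function of the observer.
    x = np.linspace(0.0, 1.0, width)
    y = np.linspace(0.0, 1.0, height)
    freq = f_min * (f_max / f_min) ** x       # instantaneous frequency
    phase = 2 * np.pi * np.cumsum(freq) / width
    contrast = c_max * (c_min / c_max) ** y
    grating = np.sin(phase)[None, :] * contrast[:, None]
    return np.clip(mean * (1.0 + grating), 0.0, 1.0)

chart = swept_frequency_chart()
print(chart.shape, float(chart.min()), float(chart.max()))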

Relevance: 100.00%

Abstract:

Smart cameras allow pre-processing of video data on the camera instead of sending it to a remote server for further analysis. Having a network of smart cameras allows various vision tasks to be processed in a distributed fashion. While cameras may have different tasks, we concentrate on distributed tracking in smart camera networks. This application introduces various highly interesting problems. Firstly, how can conflicting goals be satisfied, such as cameras in the network trying to track objects while also keeping communication overhead low? Secondly, how can cameras in the network self-adapt in response to the behavior of objects and changes in scenarios, to ensure continued efficient performance? Thirdly, how can cameras organise themselves to improve the overall network's performance and efficiency? This paper presents a simulation environment, called CamSim, that allows distributed self-adaptation and self-organisation algorithms to be tested without setting up a physical smart camera network. The simulation tool is written in Java and hence is highly portable between different operating systems. Abstracting away various computer vision and network communication problems enables a focus on implementing and testing new self-adaptation and self-organisation algorithms for cameras to use.
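A toy version of the distributed tracking trade-off described above can be sketched in Python; CamSim itself is written in Java, and the class names, circular fields of view, and nearest-camera handover rule below are illustrative assumptions rather than CamSim's API.

import random

class Camera:
    def __init__(self, cam_id, x, y, fov_radius):
        self.cam_id, self.x, self.y, self.fov = cam_id, x, y, fov_radius

    def sees(self, obj):
        return (self.x - obj.x) ** 2 + (self.y - obj.y) ** 2 <= self.fov ** 2

class MovingObject:
    def __init__(self, obj_id, x, y):
        self.obj_id, self.x, self.y = obj_id, x, y

    def step(self):
        self.x += random.uniform(-1, 1)
        self.y += random.uniform(-1, 1)

def simulate(cameras, objects, steps=50):
    # Each step: move objects, then assign each object to a camera that
    # sees it, keeping the current owner when possible so that handover
    # (communication) messages stay low.
    messages = 0
    owner = {o.obj_id: None for o in objects}
    for _ in range(steps):
        for o in objects:
            o.step()
            current = owner[o.obj_id]
            if current is not None and current.sees(o):
                continue                      # no handover needed
            candidates = [c for c in cameras if c.sees(o)]
            if candidates:
                new_owner = min(candidates,
                                key=lambda c: (c.x - o.x) ** 2 + (c.y - o.y) ** 2)
                if new_owner is not current:
                    messages += 1             # one handover message
                    owner[o.obj_id] = new_owner
    return messages

random.seed(0)
cams = [Camera(i, x, y, 8.0) for i, (x, y) in enumerate([(5, 5), (15, 5), (10, 15)])]
objs = [MovingObject(i, 10.0, 10.0) for i in range(3)]
print("handover messages:", simulate(cams, objs))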

Relevance: 100.00%

Abstract:

On the basis of the convolutional (Hamming) version of the recent Neural Network Assembly Memory Model (NNAMM), optimal receiver operating characteristics (ROCs) have been derived analytically for an intact two-layer autoassociative Hopfield network. A method of explicitly taking into account the a priori probabilities of alternative hypotheses on the structure of the information initiating memory trace retrieval is introduced, together with modified ROCs (mROCs, a posteriori probabilities of correct recall vs. false-alarm probability). The comparison of empirical and calculated ROCs (or mROCs) demonstrates that they coincide quantitatively, and in this way the intensities of cues used in the corresponding experiments may be estimated. It has been found that the basic ROC properties, which are among the experimental findings underpinning dual-process models of recognition memory, can be explained within our one-factor NNAMM.
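The relation between ROCs and the modified ROCs described above can be sketched numerically in Python. The Gaussian memory-strength model and the prior value below are illustrative assumptions, not the NNAMM-derived distributions; the point is only that the mROC applies Bayes' rule with the a priori probability that the test item contains a genuine memory trace.

import numpy as np

def roc_and_mroc(target_scores, lure_scores, prior=0.5, n_thresholds=200):
    # Sweep a decision criterion over the scores and return, per threshold,
    # the false-alarm rate, the hit (correct recall) rate, and the a
    # posteriori probability of correct recall given a "yes" response.
    lo = min(target_scores.min(), lure_scores.min())
    hi = max(target_scores.max(), lure_scores.max())
    thresholds = np.linspace(lo, hi, n_thresholds)
    hits = np.array([(target_scores >= t).mean() for t in thresholds])
    fas = np.array([(lure_scores >= t).mean() for t in thresholds])
    posterior = prior * hits / np.clip(prior * hits + (1 - prior) * fas, 1e-12, None)
    return fas, hits, posterior

rng = np.random.default_rng(0)
targets = rng.normal(1.0, 1.0, 10000)         # strengths of "old" items
lures = rng.normal(0.0, 1.0, 10000)           # strengths of "new" items
fa, hit, post = roc_and_mroc(targets, lures, prior=0.3)
print(fa[100], hit[100], post[100])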

Relevance: 100.00%

Abstract:

This paper presents a novel algorithm for medial surface extraction that is based on the density-corrected Hamiltonian analysis of Torsello and Hancock [1]. In order to cope with the exponential growth of the number of voxels, we compute a first coarse discretization of the mesh, which is iteratively refined until a desired resolution is achieved. The refinement criterion relies on the analysis of the momentum field, where only the voxels with a suitable value of the divergence are exploded to a lower level of the hierarchy. In order to compensate for the discretization errors incurred at the coarser levels, a dilation procedure is added at the end of each iteration. Finally, we design a simple alignment procedure to correct the displacement of the extracted skeleton with respect to the true underlying medial surface. We evaluate the proposed approach with an extensive series of qualitative and quantitative experiments. © 2013 Elsevier Inc. All rights reserved.
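The divergence-based refinement criterion can be sketched on a voxel grid with NumPy as below. The finite-difference divergence, the sign convention, and the threshold are illustrative assumptions; the density correction and the Hamiltonian analysis of [1] are not reproduced.

import numpy as np

def divergence(field):
    # Divergence of a 3D vector field of shape (3, nx, ny, nz).
    fx, fy, fz = field
    return (np.gradient(fx, axis=0) +
            np.gradient(fy, axis=1) +
            np.gradient(fz, axis=2))

def refinement_mask(field, threshold):
    # Mark voxels whose momentum-field divergence is strongly negative;
    # these are the candidates to be exploded to the next, finer level.
    return divergence(field) < -threshold

# Toy field: inward flux towards the centre of a 32^3 grid.
coords = np.stack(np.meshgrid(*[np.arange(32.0)] * 3, indexing='ij'))
dist = np.linalg.norm(coords - 15.5, axis=0)
field = -np.stack(np.gradient(dist))
print(refinement_mask(field, threshold=0.5).sum(), "candidate voxels")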

Relevance: 100.00%

Abstract:

The seminal multiple-view stereo benchmark evaluations from Middlebury and by Strecha et al. have played a major role in propelling the development of multi-view stereopsis (MVS) methodology. The somewhat small size and variability of these data sets, however, limit their scope and the conclusions that can be derived from them. To facilitate further development within MVS, we here present a new and varied data set consisting of 80 scenes, seen from 49 or 64 accurate camera positions. This is accompanied by accurate structured light scans for reference and evaluation. In addition, all images are taken under seven different lighting conditions. As a benchmark, and to validate the use of our data set for obtaining reasonable and statistically significant findings about MVS, we have applied the three state-of-the-art MVS algorithms by Campbell et al., Furukawa et al., and Tola et al. to the data set. To do this we have extended the evaluation protocol from the Middlebury evaluation, necessitated by the more complex geometry of some of our scenes. The data set and accompanying evaluation framework are made freely available online. Based on this evaluation, we are able to observe several characteristics of state-of-the-art MVS, e.g. that there is a tradeoff between the quality of the reconstructed 3D points (accuracy) and how much of an object’s surface is captured (completeness). Also, several issues that we hypothesized would challenge MVS, such as specularities and changing lighting conditions, did not pose serious problems. Our study finds that the two most pressing issues for MVS are lack of texture and meshing (forming 3D points into closed triangulated surfaces).
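The accuracy/completeness trade-off used in such evaluations can be computed with a simple point-to-point protocol; a simplified Python sketch using KD-trees is given below. The real, extended protocol additionally handles observability masks and the structured-light reference scans, which are omitted here, and the point clouds are synthetic.

import numpy as np
from scipy.spatial import cKDTree

def accuracy_completeness(reconstruction, reference, max_dist=None):
    # Accuracy: distance from each reconstructed point to the reference.
    # Completeness: distance from each reference point to the reconstruction.
    # Inputs are (N, 3) arrays of 3D points.
    acc = cKDTree(reference).query(reconstruction)[0]
    comp = cKDTree(reconstruction).query(reference)[0]
    if max_dist is not None:                  # cap gross outliers
        acc = np.minimum(acc, max_dist)
        comp = np.minimum(comp, max_dist)
    return float(acc.mean()), float(comp.mean())

rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, (5000, 3))
reconstruction = reference[:4000] + rng.normal(0.0, 0.002, (4000, 3))
print(accuracy_completeness(reconstruction, reference, max_dist=0.05))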

Relevance: 100.00%

Abstract:

Learning and memory in adult females decline during menopause, and estrogen replacement therapy is commonly prescribed for menopausal women. Post-menopausal women also tend to suffer from depression and are prescribed antidepressants in addition to hormone therapy. Estrogen replacement therapy is a topic that engenders debate, since several studies contradict its efficacy as a palliative therapy for cognitive decline and neurodegenerative diseases. Signaling transduction pathways can alter brain cell activity, survival, and morphology by facilitating transcription factor DNA binding and protein production. The steroidal hormone estrogen and the antidepressant drug lithium interact through these signaling transduction pathways, facilitating transcription factor activation. The paucity of data on how combined hormones and antidepressants interact in regulating gene expression led me to hypothesize that, in primary mixed brain cell cultures, combined 17β-estradiol (E2) and lithium chloride (LiCl) (E2/LiCl) will alter the genetic expression of markers involved in synaptic plasticity and neuroprotection. Results from these studies indicated that a 48 h treatment with E2/LiCl reduced glutamate receptor subunit genetic expression, but increased neurotrophic factor and estrogen receptor genetic expression. Combined treatment also failed to protect brain cell cultures from glutamate excitotoxicity. If lithium facilitates protein signaling pathways mediated by estrogen, can lithium alone serve as a palliative treatment for post-menopause? This question led me to hypothesize that, in estrogen-deficient mice, lithium alone will increase episodic memory (tested via object recognition) and enhance expression in the brain of factors involved in anti-apoptosis, learning and memory. I used bilaterally ovariectomized (bOVX) C57BL/6J mice treated with LiCl for one month. Results indicated that LiCl-treated bOVX mice showed increased performance in object recognition compared with non-treated bOVX mice. The increased performance in LiCl-treated bOVX mice coincided with augmented genetic and protein expression in the brain. Understanding the molecular pathways of estrogen will assist in identifying a palliative therapy for menopause-related dementia, and lithium may serve this purpose by acting as a selective estrogen-mediated signaling modulator.

Relevance: 100.00%

Abstract:

Today, most conventional surveillance networks are based on analog systems, which impose many constraints, such as manpower and high-bandwidth requirements, and have become a barrier to the development of modern surveillance networks. This dissertation describes a digital surveillance network architecture based on the H.264 coding/decoding (CODEC) System-on-a-Chip (SoC) platform. The proposed digital surveillance network architecture includes three major layers: the software layer, the hardware layer, and the network layer. The contributions to the proposed architecture are as follows. (1) We implement an object recognition system and an object categorization system on the software layer by applying several Digital Image Processing (DIP) algorithms. (2) For a better compression ratio and higher-quality video transfer, we implement two new modules on the hardware layer of the H.264 CODEC core: the background elimination module and the Directional Discrete Cosine Transform (DDCT) module. (3) Furthermore, we introduce a Digital Signal Processor (DSP) sub-system on the main bus of the H.264 SoC platform as the major hardware support system for our software architecture. We thus combine the software and hardware platforms into an intelligent surveillance node. Lab results show that the proposed surveillance node can dramatically save network resources such as bandwidth and storage capacity.
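The background elimination idea can be illustrated in software with a running-average background model; the NumPy sketch below only shows the principle (the dissertation implements the module in hardware inside the H.264 CODEC core), and the learning rate and threshold are assumptions.

import numpy as np

class BackgroundEliminator:
    def __init__(self, alpha=0.05, threshold=15):
        self.bg = None                        # background estimate
        self.alpha = alpha                    # background learning rate
        self.threshold = threshold            # foreground decision threshold

    def process(self, frame):
        f = frame.astype(float)
        if self.bg is None:
            self.bg = f.copy()                # bootstrap from the first frame
        foreground = np.abs(f - self.bg) > self.threshold
        # Update the background only where the scene looks static.
        self.bg = np.where(foreground, self.bg,
                           (1 - self.alpha) * self.bg + self.alpha * f)
        # Zero out background pixels so the encoder spends no bits on them.
        return np.where(foreground, frame, 0).astype(frame.dtype)

# Toy usage: learn an empty scene, then a bright object enters.
elim = BackgroundEliminator()
background = np.full((120, 160), 50, dtype=np.uint8)
elim.process(background)
frame = background.copy()
frame[40:60, 60:80] = 200
print(int((elim.process(frame) > 0).sum()), "foreground pixels kept")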