992 results for Bracciolini, Poggio, 1380-1459
Abstract:
In the first part of this paper we show that a new technique exploiting 1D correlation of 2D or even 1D patches between successive frames may be sufficient to compute a satisfactory estimation of the optical flow field. The algorithm is well-suited to VLSI implementations. The sparse measurements provided by the technique can be used to compute qualitative properties of the flow for a number of different visual tasks. In particular, the second part of the paper shows how to combine our 1D correlation technique with a scheme for detecting expansion or rotation ([5]) in a simple algorithm which also suggests interesting biological implications. The algorithm provides a rough estimate of time-to-crash. It was tested on real image sequences. We show its performance and compare the results to previous approaches.
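As a rough illustration of the 1D correlation idea, the sketch below estimates the horizontal displacement of a 1D patch between two image rows by searching for the shift with the highest normalized correlation. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function names, the search window, and the toy rows are all hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Normalized correlation of two equal-length 1D sequences (0.0 if either is flat)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den > 0 else 0.0


def best_shift_1d(prev_row, next_row, start, length, max_shift):
    """Return the shift (in pixels) that best aligns a 1D patch across two rows.

    Hypothetical helper: the patch prev_row[start:start + length] is compared
    with every candidate window of next_row within +/- max_shift pixels.
    """
    patch = prev_row[start:start + length]
    best_score, best_shift = -2.0, 0
    for shift in range(-max_shift, max_shift + 1):
        lo = start + shift
        if lo < 0 or lo + length > len(next_row):
            continue  # candidate window falls outside the row
        score = normalized_correlation(patch, next_row[lo:lo + length])
        if score > best_score:
            best_score, best_shift = score, shift
    return best_shift


# Toy rows (assumed data): the intensity bump moves two pixels to the right.
prev_row = [0, 0, 1, 5, 2, 0, 0, 0, 0, 0]
next_row = [0, 0, 0, 0, 1, 5, 2, 0, 0, 0]
shift = best_shift_1d(prev_row, next_row, start=2, length=3, max_shift=3)
```

Repeating such a search along a second direction would give a sparse two-component flow estimate; how the paper combines the measurements is not specified in the abstract.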
Abstract:
This paper describes the main features of a view-based model of object recognition. The model tries to capture general properties to be expected in a biological architecture for object recognition. The basic module is a regularization network in which each of the hidden units is broadly tuned to a specific view of the object to be recognized.
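To make the "broadly tuned hidden unit" idea concrete, here is a minimal sketch that assumes Gaussian tuning around stored views. The feature vectors, the width `sigma`, and the output weights are hypothetical, and the abstract does not state the exact basis function used.

```python
import math


def rbf_activations(view, stored_views, sigma):
    """Gaussian response of each hidden unit to an input view.

    Each hidden unit is centered on one stored view (assumed to be a feature
    vector); sigma controls how broadly the unit is tuned.
    """
    activations = []
    for center in stored_views:
        dist_sq = sum((a - b) ** 2 for a, b in zip(view, center))
        activations.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
    return activations


def network_output(view, stored_views, weights, sigma):
    """Weighted sum of the hidden-unit responses (the network's recognition score)."""
    acts = rbf_activations(view, stored_views, sigma)
    return sum(w * a for w, a in zip(weights, acts))


# Two stored views of a hypothetical object, described by 2D feature vectors.
stored_views = [[0.0, 0.0], [1.0, 0.0]]
acts = rbf_activations([0.0, 0.0], stored_views, sigma=1.0)
score = network_output([0.5, 0.0], stored_views, weights=[1.0, 1.0], sigma=1.0)
```

An input that coincides with a stored view drives that unit maximally, while intermediate views activate several units partially, which is what allows interpolation between stored views.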
Abstract:
How does the brain recognize three-dimensional objects? We trained monkeys to recognize computer-rendered objects presented from an arbitrarily chosen training view, and subsequently tested their ability to generalize recognition for other views. Our results provide additional evidence in favor of a recognition model that accomplishes view-invariant performance by storing a limited number of object views or templates together with the capacity to interpolate between the templates (Poggio and Edelman, 1990).
Abstract:
This paper consists of two major parts. First, we present the outline of a simple approach to a very-low-bandwidth video-conferencing system relying on an example-based hierarchical image compression scheme. In particular, we discuss the use of example images as a model, the number of required examples, faces as a class of semi-rigid objects, a hierarchical model based on decomposition into different time-scales, and the decomposition of face images into patches of interest. In the second part, we present several algorithms for image processing and animation as well as experimental evaluations. Among the original contributions of this paper is an automatic algorithm for pose estimation and normalization. We also review and compare different algorithms for finding the nearest neighbors in a database for a new input as well as a generalized algorithm for blending patches of interest in order to synthesize new images. Finally, we outline the possible integration of several algorithms to illustrate a simple model-based video-conference system.
Abstract:
The need to generate new views of a 3D object from a single real image arises in several fields, including graphics and object recognition. While the traditional approach relies on the use of 3D models, we have recently introduced techniques that are simpler but applicable only under restricted conditions. The approach exploits image transformations that are specific to the relevant object class and learnable from example views of other "prototypical" objects of the same class. In this paper, we introduce such a new technique by extending the notion of linear class first proposed by Poggio and Vetter. For linear object classes it is shown that linear transformations can be learned exactly from a basis set of 2D prototypical views. We demonstrate the approach on artificial objects and then show preliminary evidence that the technique can effectively "rotate" high-resolution face images from a single 2D view.
Abstract:
The inferior temporal cortex (IT) of monkeys is thought to play an essential role in visual object recognition. Inferotemporal neurons are known to respond to complex visual stimuli, including patterns like faces, hands, or other body parts. What is the role of such neurons in object recognition? The present study examines this question in combined psychophysical and electrophysiological experiments, in which monkeys learned to classify and recognize novel visual 3D objects. A population of neurons in IT was found to respond selectively to such objects that the monkeys had recently learned to recognize. A large majority of these cells discharged maximally for one view of the object, while their response fell off gradually as the object was rotated away from the neuron's preferred view. Most neurons exhibited orientation-dependent responses also during view-plane rotations. Some neurons were found tuned around two views of the same object, while a very small number of cells responded in a view-invariant manner. For five different objects that were extensively used during the training of the animals, and for which behavioral performance became view-independent, multiple cells were found that were tuned around different views of the same object. No selective responses were ever encountered for views that the animal systematically failed to recognize. The results of our experiments suggest that neurons in this area can develop a complex receptive field organization as a consequence of extensive training in the discrimination and recognition of objects. Simple geometric features did not appear to account for the neurons' selective responses. These findings support the idea that a population of neurons -- each tuned to a different object aspect, and each showing a certain degree of invariance to image transformations -- may, as an assembly, encode complex 3D objects.
In such a system, several neurons may be active for any given vantage point, with a single unit acting like a blurred template for a limited neighborhood of a single view.
Abstract:
If we are provided a face database with only one example view per person, is it possible to recognize new views of them under a variety of different poses, especially views rotated in depth from the original example view? We investigate using prior knowledge about faces plus each single example view to generate virtual views of each person, or views of the face as seen from different poses. Prior knowledge of faces is represented in an example-based way, using 2D views of a prototype face seen rotating in depth. The synthesized virtual views are evaluated as example views in a view-based approach to pose-invariant face recognition. They are shown to improve the recognition rate over the scenario where only the single real view is used.
Abstract:
Template matching by means of cross-correlation is common practice in pattern recognition. However, its sensitivity to deformations of the pattern and the broad and unsharp peaks it produces are significant drawbacks. This paper reviews some results on how these shortcomings can be removed. Several techniques (Matched Spatial Filters, Synthetic Discriminant Functions, Principal Components Projections and Reconstruction Residuals) are reviewed and compared on a common task: locating eyes in a database of faces. New variants are also proposed and compared: least squares Discriminant Functions and the combined use of projections on eigenfunctions and the corresponding reconstruction residuals. Finally, approximation networks are introduced in an attempt to improve filter design by the introduction of nonlinearity.
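For reference, the baseline that this abstract takes as its starting point, template matching by cross-correlation, can be sketched as an exhaustive search for the image patch with the highest normalized cross-correlation. This is a minimal illustration with hypothetical function names and toy data; it does not implement any of the improved filters the paper reviews.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2D lists (0.0 if either is flat)."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(
        sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t)
    )
    return num / den if den > 0 else 0.0


def match_template(image, template):
    """Return (score, row, col) of the best-matching top-left position in the image."""
    t_rows, t_cols = len(template), len(template[0])
    best = (-2.0, 0, 0)
    for r in range(len(image) - t_rows + 1):
        for c in range(len(image[0]) - t_cols + 1):
            patch = [row[c:c + t_cols] for row in image[r:r + t_rows]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best


# Toy image (assumed data) containing an exact copy of the template at row 1, column 1.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]
score, row, col = match_template(image, template)
```

The score is computed from the raw intensities of a rigid template, so a deformed or differently shaped pattern lowers it directly; that sensitivity is one of the drawbacks the abstract names.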
Abstract:
The development of increasingly sophisticated and powerful computers in the last few decades has frequently stimulated comparisons between them and the human brain. Such comparisons will become more earnest as computers are applied more and more to tasks formerly associated with essentially human activities and capabilities. The expectation of a coming generation of "intelligent" computers and robots with sensory, motor and even "intellectual" skills comparable in quality to (and quantitatively surpassing) our own is becoming more widespread and is, I believe, leading to a new and potentially productive analytical science of "information processing". In no field has this new approach been so precisely formulated and so thoroughly exemplified as in the field of vision. As the dominant sensory modality of man, vision is one of the major keys to our mastery of the environment, to our understanding and control of the objects which surround us. If we wish to create robots capable of performing complex manipulative tasks in a changing environment, we must surely endow them with (among other things) adequate visual powers. How can we set about designing such flexible and adaptive robots? In designing them, can we make use of our rapidly growing knowledge of the human brain, and if so, how? At the same time, can our experiences in designing artificial vision systems help us to understand how the brain analyzes visual information?
Abstract:
This thesis introduces the Named-State Register File, a fine-grain, fully-associative register file. The NSF allows fast context switching between concurrent threads as well as efficient sequential program performance. The NSF holds more live data than conventional register files, and requires less spill and reload traffic to switch between contexts. This thesis demonstrates an implementation of the Named-State Register File and estimates the access time and chip area required for different organizations. Architectural simulations of large sequential and parallel applications show that the NSF can reduce execution time by 9% to 17% compared to alternative register files.
Abstract:
Kargl, Florian; Meyer, A.; Koza, M.M.; Schober, H., (2006) 'Formation of channels for fast-ion diffusion in alkali silicate melts: A quasielastic neutron scattering study', Physical Review B: Condensed Matter and Materials Physics 74 pp.14304 RAE2008
Abstract:
Oceanic bubble plumes caused by ship wakes or breaking waves disrupt sonar communication because of the dramatic change in sound speed and attenuation in the bubbly fluid. Experiments in bubbly fluids have suffered from the inability to quantitatively characterize the fluid because of continuous air bubble motion. Conversely, single bubble experiments, where the bubble is trapped by a pressure field or stabilizing object, are limited in usable frequency range, apparatus complexity, or the invasive nature of the stabilizing object (wire, plate, etc.). Suspension of a bubble in a viscoelastic Xanthan gel allows acoustically forced oscillations with negligible translation over a broad frequency band. Assuming only linear, radial motion, laser scattering from a bubble oscillating below, through, and above its resonance is measured. As the bubble dissolves in the gel, different bubble sizes are measured in the range 240 – 470 μm radius, corresponding to the frequency range 6 – 14 kHz. Equalization of the cell response in the raw data isolates the frequency response of the bubble. Comparison to theory for a bubble in water shows good agreement between the predicted resonance frequency and damping, such that the bubble behaves as if it were oscillating in water.
Abstract:
Quadsim is an intermediate code simulator. It allows you to "run" programs that your compiler generates in intermediate code format. Its user interface is similar to most debuggers in that you can step through your program, instruction by instruction, set breakpoints, examine variable values, and so on. The intermediate code format used by Quadsim is that described in [Aho 86]. If your compiler generates intermediate code in this format, you will be able to take intermediate-code files generated by your compiler, load them into the simulator, and watch them "run." You are provided with functions that hide the internal representation of intermediate code. You can use these functions within your compiler to generate intermediate code files that can be read by the simulator. Quadsim was inspired and greatly influenced by [Aho 86]. The material in chapter 8 (Intermediate Code Generation) of [Aho 86] should be considered background material for users of Quadsim.
Abstract:
Bacteriophages, viruses infecting bacteria, are uniformly present in any location where there are high numbers of bacteria, both in the external environment and the human body. Knowledge of their diversity is limited by the difficulty of culturing the host species and by the lack of a universal marker gene present in all viruses. Metagenomics is a powerful tool that can be used to analyse viral communities in their natural environments. The aim of this study was to investigate diverse populations of uncultured viruses from clinical (sputum from a patient with cystic fibrosis, CF) and environmental samples (sludge from a dairy food wastewater treatment plant) containing rich bacterial populations using genetic and metagenomic analyses. Metagenomic sequencing of viruses obtained from these samples revealed that the majority of the metagenomic reads (97-99%) were novel when compared to the NCBI protein database using BLAST. A large proportion of assembled contigs were assignable as novel phages or uncharacterised prophages, the next largest assignable group being single-stranded eukaryotic virus genomes. Sputum from a cystic fibrosis patient contained DNA typical of phages of bacteria that are traditionally involved in CF lung infections and other bacteria that are part of the normal oral flora. The only eukaryotic virus detected in the CF sputum was Torque Teno virus (TTV). A substantial number of assigned sequences from dairy wastewater could be affiliated with phages of bacteria that are typically found in the soil and aquatic environments, including wastewater. Eukaryotic viral sequences were dominated by plant pathogens from the Geminiviridae and Nanoviridae families, and animal pathogens from the Circoviridae family. Antibiotic resistance genes were detected in both metagenomes, suggesting phages could be a source for transmissible antimicrobial resistance.
Overall, diversity of viruses in the CF sputum was low, with 89 distinct viral genotypes predicted, and higher (409 genotypes) in the wastewater. Function-based screening of a metagenomic library constructed from DNA extracted from dairy food wastewater viruses revealed candidate promoter sequences that have the ability to drive expression of GFP in a promoter-trap vector in Escherichia coli. The majority of the cloned DNA sequences selected by the assay were related to ssDNA circular eukaryotic viruses and phages which formed a minority of the metagenome assembly, and many lacked any significant homology to known database sequences. Natural diversity of bacteriophages in wastewater samples was also examined by PCR amplification of the major capsid protein sequences, conserved within T4-type bacteriophages from the Myoviridae family. Phylogenetic analysis of capsid sequences revealed that dairy wastewater contained mainly diverse and uncharacterized phages, while some showed a high level of similarity with phages from geographically distant environments.
Abstract:
We consider the problem of variable selection in regression modeling in high-dimensional spaces where there is known structure among the covariates. This is an unconventional variable selection problem for two reasons: (1) The dimension of the covariate space is comparable to, and often much larger than, the number of subjects in the study, and (2) the covariate space is highly structured, and in some cases it is desirable to incorporate this structural information into the model building process. We approach this problem through the Bayesian variable selection framework, where we assume that the covariates lie on an undirected graph and formulate an Ising prior on the model space for incorporating structural information. Certain computational and statistical problems arise that are unique to such high-dimensional, structured settings, the most interesting being the phenomenon of phase transitions. We propose theoretical and computational schemes to mitigate these problems. We illustrate our methods on two different graph structures: the linear chain and the regular graph of degree k. Finally, we use our methods to study a specific application in genomics: the modeling of transcription factor binding sites in DNA sequences. © 2010 American Statistical Association.
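For orientation, one common way to write an Ising prior over variable-inclusion indicators on a graph is sketched below. The symbols are assumptions introduced here for illustration (inclusion indicators \gamma_i, edge set E, sparsity parameter a, interaction parameter b); the paper's exact parameterization may differ.

```latex
% Assumed notation: \gamma_i \in \{0, 1\} indicates whether covariate i is in the model,
% E is the edge set of the undirected covariate graph,
% a controls overall sparsity and b \ge 0 couples neighboring covariates.
p(\gamma) \;\propto\; \exp\!\Big( a \sum_{i} \gamma_i \;+\; b \sum_{(i,j) \in E} \gamma_i \gamma_j \Big)
```

Under this form, a positive b raises the prior probability of models that include covariates adjacent on the graph, which is how the structural information enters the model space.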