988 results for object classification
Abstract:
In the present paper, a general analytic expression linking the surface roughness of an object under study in an emission electron microscope to the microscope's resolution has been obtained and confirmed by computer simulation. A quantitative derivation was made for the model case in which there is a step on the object surface. It was shown that the resolution deteriorates asymmetrically relative to the step. The effect sets a practical limit on the ultimate lateral resolution obtainable in an emission electron microscope.
Abstract:
The time-courses of orthographic, phonological and semantic processing of Chinese characters were investigated systematically with multi-channel event-related potentials (ERPs). New evidence concerning whether phonology or semantics is processed first and whether phonology mediates semantic access was obtained, supporting and developing the new concept of repetition, overlapping, and alternating processing in Chinese character recognition. Statistical parameter mapping based on physiological double dissociation has been developed. Seven experiments were conducted: 1) deciding which type of structure, left-right or non-left-right, the character displayed on the screen had; 2) deciding whether or not there was a vowel /a/ in the pronunciation of the character; 3) deciding which category, natural object or non-natural object, the character belonged to; 4) deciding which color, red or green, the character was; 5) deciding which color, red or green, the non-character was; 6) fixating on the non-character; 7) fixating on the crosslet. The main results are: 1. N240 and P240: N240 and P240, localized at occipital and prefrontal sites respectively, were found in experiments 1, 2, 3, and 4, but not in experiments 5, 6, or 7. The only difference between the former four and the latter three experiments was their stimuli: the former used true Chinese characters while the latter used non-characters or a crosslet. Thus these two components were related to Chinese characters and reflected processing unique to Chinese characters, peaking at about 240 msec. 2. Basic visual feature analysis: In comparison with experiment 7, there was a common cognitive process in experiments 1, 2, 4, and 6: basic visual feature analysis. The corresponding ERP amplitude increase at most sites started from about 60 msec. 3. Orthography: The ERP differences located at the main processing area of orthography (occipital) between experiments 1, 2, 3, 4 and experiment 5 started from about 130 msec. 
This was the category difference between Chinese characters and non-characters, which revealed that orthographic processing started from about 130 msec. The ERP differences between experiments 1, 2, 3 and experiment 4 occurred at 210-250, 230-240, and 190-250 msec respectively, suggesting that orthography was processed again. These were the differences between language and non-language tasks, which revealed a higher level of processing than that at the above-mentioned 130 msec. All these phenomena imply that orthographic processing is not finished in a single pass; the second pass is not a simple repetition, but processing at a higher level. 4. Phonology: The ERPs of experiment 2 (phonological task) were significantly stronger than those of experiment 3 (semantic task) at the main processing areas of phonology (temporal and left prefrontal) starting from about 270 msec, which revealed phonological processing. The ERP differences at left frontal sites between experiment 2 and experiment 1 (orthographic task) started from about 250 msec. When the phonological task was compared with experiment 4 (character color decision), the ERP differences at left temporal and prefrontal sites started from about 220 msec. Thus phonological processing may start before 220 msec. 5. Semantics: The ERPs of experiment 3 (semantic task) were significantly stronger than those of experiment 2 (phonological task) at the main processing areas of semantics (parietal and occipital) starting from about 290 msec, which revealed semantic processing. The ERP differences at these areas between experiment 3 and experiment 4 (character color decision) started from about 270 msec. The ERP differences between experiment 3 and experiment 1 (orthographic task) started from about 260 msec. Thus semantic processing may start before 260 msec. 6. 
Overlapping of phonological and semantic processing: From about 270 to 350 msec, the ERPs of experiment 2 (phonological task) were significantly larger than those of experiment 3 (semantic task) at the main processing areas of phonology (temporal and left prefrontal), while from about 290 to 360 msec, the ERPs of experiment 3 were significantly larger than those of experiment 2 at the main processing areas of semantics (frontal, parietal, and occipital). Thus phonological processing may start earlier than semantic processing and their time-courses may alternate, which reveals parallel processing. 7. Semantic processing needs partial phonology: When experiment 1 (orthographic task) served as the baseline, the ERPs of experiments 2 and 3 (phonological and semantic tasks) increased significantly at the main processing areas of phonology (left temporal and frontal) starting from about 250 msec. The ERPs of experiment 3, in addition, increased significantly at the main processing areas of semantics (parietal and frontal) starting from about 260 msec. When experiment 4 (character color decision) served as the baseline, the ERPs of experiments 2 and 3 increased significantly at phonological areas (left temporal and frontal) starting from about 220 msec. The ERPs of experiment 3, similarly, increased significantly at semantic areas (parietal and frontal) starting from about 270 msec. Hence, before semantic processing, a part of the phonological information may be required. The following conclusions can be drawn from the above results under the present experimental conditions: 1. Basic visual feature processing starts from about 60 msec; 2. Orthographic processing starts from about 130 msec and repeats at about 240 msec. The second pass is not a simple repetition of the first one, but processing at a higher level; 3. Phonological processing begins earlier than semantic processing, and their time-courses overlap; 4. Before semantic processing, a part of the phonological information may be required; 5. 
Repetition, overlapping, and alternation of the orthographic, phonological and semantic processing of Chinese characters could exist in cognition. Thus the question of whether phonology mediates semantic access is not a simple issue but a complicated one.
Abstract:
Three experiments were conducted in an attempt to understand the development of hierarchical classification. Three- and four-year-olds as well as kindergartners were given matching or object-sorting tasks with either basic-level or superordinate relations. The results indicated: 1. Even 3-year-olds can consistently sort at the basic level. However, children perform poorly at the superordinate level, and there are developmental differences in the ability to sort at this level. 2. The perceptual similarity of stimulus materials, various competing organizations, and the different ways of dealing with superordinate categories, depending on whether the categories are explicitly or implicitly represented for the child, are all factors that appear to contribute to children's hierarchical classification.
Abstract:
We consider the problem of matching model and sensory data features in the presence of geometric uncertainty, for the purpose of object localization and identification. The problem is to construct sets of model feature and sensory data feature pairs that are geometrically consistent given that there is uncertainty in the geometry of the sensory data features. If there is no geometric uncertainty, polynomial-time algorithms are possible for feature matching, yet these approaches can fail when there is uncertainty in the geometry of data features. Existing matching and recognition techniques which account for the geometric uncertainty in features either cannot guarantee finding a correct solution, or can construct geometrically consistent sets of feature pairs yet have worst-case exponential complexity in terms of the number of features. The major new contribution of this work is to demonstrate a polynomial-time algorithm for constructing sets of geometrically consistent feature pairs given uncertainty in the geometry of the data features. We show that under a certain model of geometric uncertainty the feature matching problem in the presence of uncertainty is of polynomial complexity. This has important theoretical implications by demonstrating an upper bound on the complexity of the matching problem, and by offering insight into the nature of the matching problem itself. These insights prove useful in the solution to the matching problem in higher-dimensional cases as well, such as matching three-dimensional models to either two- or three-dimensional sensory data. The approach is based on an analysis of the space of feasible transformation parameters. This paper outlines the mathematical basis for the method, and describes the implementation of an algorithm for the procedure. Experiments demonstrating the method are reported.
Abstract:
Many current recognition systems use constrained search to locate objects in cluttered environments. Previous formal analysis has shown that the expected amount of search is quadratic in the number of model and data features if all the data is known to come from a single object, but is exponential when spurious data is included. If one can group the data into subsets likely to have come from a single object, then terminating the search once a "good enough" interpretation is found reduces the expected search to cubic. Without successful grouping, terminated search is still exponential. These results apply to finding instances of a known object in the data. In this paper, we turn to the problem of selecting models from a library, and examine the combinatorics of determining that a candidate object is not present in the data. We show that the expected search is again exponential, implying that naive approaches to indexing are likely to carry an expensive overhead, since an exponential amount of work is needed to weed out each of the incorrect models. The analytic results are shown to be in agreement with empirical data for cluttered object recognition.
Abstract:
We report a series of psychophysical experiments that explore different aspects of the problem of object representation and recognition in human vision. Contrary to the paradigmatic view which holds that the representations are three-dimensional and object-centered, the results consistently support the notion of view-specific representations that include at most partial depth information. In simulated experiments that involved the same stimuli shown to the human subjects, computational models built around two-dimensional multiple-view representations replicated our main psychophysical results, including patterns of generalization errors and the time course of perceptual learning.
Abstract:
In this paper we present some extensions to the k-means algorithm for vector quantization that permit its efficient use in image segmentation and pattern classification tasks. It is shown that by introducing state variables that correspond to certain statistics of the dynamic behavior of the algorithm, it is possible to find the representative centers of the lower-dimensional manifolds that define the boundaries between classes, for clouds of multi-dimensional, multi-class data; this permits one, for example, to find class boundaries directly from sparse data (e.g., in image segmentation tasks) or to efficiently place centers for pattern classification (e.g., with local Gaussian classifiers). The same state variables can be used to define algorithms for determining adaptively the optimal number of centers for clouds of data with space-varying density. Some examples of the application of these extensions are also given.
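For orientation, the baseline that this abstract extends can be sketched as follows. This is a minimal sketch of plain k-means used as a vector quantizer, not the extensions the paper describes (the state variables are not reproduced here). The function names, the toy data, and the choice of initial centers are all assumptions made for illustration; results of plain k-means can depend on that initialization.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_center(p, centers):
    """Index of the center closest to p: the 'code' of p in the quantizer."""
    return min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))


def kmeans(points, initial_centers, iterations=20):
    """Plain k-means: alternate assignment and mean-update steps."""
    centers = list(initial_centers)
    for _ in range(iterations):
        # Assignment step: group every point with its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        # Update step: move each center to the mean of its group.
        for j, members in enumerate(clusters):
            if members:  # keep the old center if no point was assigned to it
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centers


# Hypothetical toy data: two well-separated clouds of 2D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
# Assumed initialization: one point taken from each end of the list.
centers = kmeans(points, initial_centers=[points[0], points[-1]])
codes = [nearest_center(p, centers) for p in points]
```

Here `codes` assigns each point the index of its representative center, which is the sense in which k-means acts as a vector quantizer.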
Abstract:
In order to recognize an object in an image, we must determine the best transformation from the object model to the image. In this paper, we show that for features from coplanar surfaces which undergo linear transformations in space, there exist projections invariant to the surface motions up to rotations in the image field. To use this property, we propose a new alignment approach to object recognition based on centroid alignment of corresponding feature groups. This method uses only a single pair of 2D model and data. Experimental results show the robustness of the proposed method against perturbations of feature positions.
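One reason centroids are convenient for alignment is that the centroid of a feature group moves with the group under any affine map, whatever the order of the points. The sketch below illustrates only that general property; it is not the method of the paper, and the matrix `A`, the offset `t`, and the model points are hypothetical values chosen for illustration.

```python
def centroid(points):
    """Mean position of a list of 2D points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def apply_affine(A, t, p):
    """Apply the affine map p -> A p + t to a 2D point."""
    (a, b), (c, d) = A
    x, y = p
    return (a * x + b * y + t[0], c * x + d * y + t[1])


# Hypothetical model feature group and an assumed affine motion.
model = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
A = ((0.8, -0.3), (0.4, 1.1))
t = (5.0, -2.0)

# The data features arrive in a different order: no point-to-point
# correspondences are needed to compare centroids.
data = [apply_affine(A, t, p) for p in reversed(model)]

mapped_model_centroid = apply_affine(A, t, centroid(model))
data_centroid = centroid(data)
```

Up to floating-point rounding, `mapped_model_centroid` and `data_centroid` coincide, so matching the centroids of corresponding groups constrains the transformation without pairing individual features.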
Abstract:
This paper describes the main features of a view-based model of object recognition. The model tries to capture general properties to be expected in a biological architecture for object recognition. The basic module is a regularization network in which each of the hidden units is broadly tuned to a specific view of the object to be recognized.
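The basic module can be pictured with a small sketch. The abstract states only that each hidden unit is broadly tuned to a specific view; the Gaussian tuning curve, the width `sigma`, the uniform output weights, and the two-number encoding of a view used below are all assumptions for illustration. In an actual regularization network the output weights would be learned from examples.

```python
import math


def view_tuned_unit(view, stored_view, sigma):
    """Response of one hidden unit: highest at its stored view,
    falling off smoothly (Gaussian tuning is an assumption)."""
    d2 = sum((a - b) ** 2 for a, b in zip(view, stored_view))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def network_output(view, stored_views, weights, sigma=1.0):
    """Weighted sum of the view-tuned hidden units."""
    return sum(w * view_tuned_unit(view, s, sigma)
               for w, s in zip(weights, stored_views))


# Hypothetical stored views of one object, each encoded as a feature vector.
stored_views = [(0.0, 0.0), (1.0, 0.0)]
weights = [1.0, 1.0]  # placeholder weights, not learned

near = network_output((0.5, 0.0), stored_views, weights)  # between stored views
far = network_output((5.0, 5.0), stored_views, weights)   # unlike any stored view
```

Because the units are broadly tuned, a view lying between two stored views still produces a strong response (`near`), while a very different view produces almost none (`far`).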
Abstract:
How does the brain recognize three-dimensional objects? We trained monkeys to recognize computer-rendered objects presented from an arbitrarily chosen training view, and subsequently tested their ability to generalize recognition to other views. Our results provide additional evidence in favor of a recognition model that accomplishes view-invariant performance by storing a limited number of object views or templates together with the capacity to interpolate between the templates (Poggio and Edelman, 1990).
Abstract:
The need to generate new views of a 3D object from a single real image arises in several fields, including graphics and object recognition. While the traditional approach relies on the use of 3D models, we have recently introduced techniques that are applicable under restricted conditions but simpler. The approach exploits image transformations that are specific to the relevant object class and learnable from example views of other "prototypical" objects of the same class. In this paper, we introduce such a new technique by extending the notion of linear class first proposed by Poggio and Vetter. For linear object classes it is shown that linear transformations can be learned exactly from a basis set of 2D prototypical views. We demonstrate the approach on artificial objects and then show preliminary evidence that the technique can effectively "rotate" high-resolution face images from a single 2D view.
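The linear-class idea can be illustrated with a toy calculation. The sketch assumes that the new object's 3D shape is exactly a linear combination of two prototype shapes, that views are orthographic projections after a rotation about the y-axis, and that the prototype views are linearly independent; the shapes, the angle, and all function names are hypothetical. Under these assumptions, coefficients fitted in one 2D view carry over to any other view, because the projection is linear.

```python
import math


def project(shape, angle):
    """Rotate 3D points about the y-axis by `angle`, drop z, and flatten
    the resulting 2D points into one view vector."""
    c, s = math.cos(angle), math.sin(angle)
    view = []
    for x, y, z in shape:
        view.extend((c * x + s * z, y))
    return view


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_two_coefficients(v1, v2, target):
    """Least-squares fit of target ~ a1*v1 + a2*v2 via the 2x2 normal equations."""
    g11, g12, g22 = dot(v1, v1), dot(v1, v2), dot(v2, v2)
    b1, b2 = dot(v1, target), dot(v2, target)
    det = g11 * g22 - g12 * g12
    return ((b1 * g22 - b2 * g12) / det, (g11 * b2 - g12 * b1) / det)


# Two hypothetical prototype shapes and a new object in their linear span.
proto_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.0, 1.0, 1.0), (1.0, 1.0, 0.2)]
proto_b = [(0.0, 0.0, 1.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.5), (1.0, 3.0, 1.0)]
new_obj = [tuple(0.3 * p + 0.7 * q for p, q in zip(pa, pb))
           for pa, pb in zip(proto_a, proto_b)]

# Fit the coefficients using only the frontal 2D views (angle 0).
a1, a2 = fit_two_coefficients(project(proto_a, 0.0), project(proto_b, 0.0),
                              project(new_obj, 0.0))

# Predict the rotated view of the new object from the prototypes' rotated views.
angle = 0.5
predicted = [a1 * u + a2 * v
             for u, v in zip(project(proto_a, angle), project(proto_b, angle))]
actual = project(new_obj, angle)
```

In this constructed case `predicted` matches `actual` up to rounding, even though the depth of the new object was never used in the fit. Real images would satisfy the linear-class assumption only approximately.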