899 results for object proposal
Abstract:
Pós-graduação em Televisão Digital: Informação e Conhecimento - FAAC
Abstract:
Object recognition has long been a core problem in computer vision. To improve object spatial support and speed up object localization for object recognition, generating high-quality category-independent object proposals as the input for an object recognition system has drawn attention recently. Given an image, we generate a limited number of high-quality, category-independent object proposals in advance and use them as inputs for many computer vision tasks. We present an efficient dictionary-based model for the image classification task. We further extend the work to a discriminative dictionary learning method for tensor sparse coding. In the first part, a multi-scale greedy-based object proposal generation approach is presented. Based on the multi-scale nature of objects in images, our approach is built on top of a hierarchical segmentation. We first identify the representative and diverse exemplar clusters within each scale. Object proposals are obtained by selecting a subset from the multi-scale segment pool via maximizing a submodular objective function, which consists of a weighted coverage term, a single-scale diversity term and a multi-scale reward term. The weighted coverage term forces the selected set of object proposals to be representative and compact; the single-scale diversity term encourages choosing segments from different exemplar clusters so that they will cover as many object patterns as possible; the multi-scale reward term encourages the selected proposals to be discriminative and selected from multiple layers generated by the hierarchical image segmentation. The experimental results on the Berkeley Segmentation Dataset and the PASCAL VOC2012 segmentation dataset demonstrate the accuracy and efficiency of our object proposal model. Additionally, we validate our object proposals in simultaneous segmentation and detection and outperform the state of the art.
To classify the object in the image, we design a discriminative, structural low-rank framework for image classification. We use a supervised learning method to construct a discriminative and reconstructive dictionary. By introducing an ideal regularization term, we perform low-rank matrix recovery for contaminated training data from all categories simultaneously without losing structural information. A discriminative low-rank representation for images with respect to the constructed dictionary is obtained. With semantic structure information and strong identification capability, this representation is good for classification tasks even using a simple linear multi-classifier.
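The proposal selection step described in the first part can be illustrated with a small greedy routine. This is a minimal sketch under stated assumptions, not the authors' implementation: the similarity matrix `sim`, the per-segment cluster labels `cluster`, and the trade-off weight `lam` are hypothetical names; the coverage term is modelled as a facility-location sum, the diversity term as a count of distinct exemplar clusters, and the multi-scale reward term is omitted for brevity.

```python
def objective(selected, sim, cluster, lam):
    """Coverage plus diversity score of a set of selected segment indices."""
    if not selected:
        return 0.0
    # Coverage (assumed form): every segment is represented by its most
    # similar selected segment.
    coverage = sum(max(row[j] for j in selected) for row in sim)
    # Diversity (assumed form): reward drawing from many exemplar clusters.
    diversity = len({cluster[j] for j in selected})
    return coverage + lam * diversity


def greedy_select(sim, cluster, k, lam=1.0):
    """Pick k segments, each time adding the one with the largest marginal gain."""
    selected = []
    remaining = list(range(len(sim)))
    for _ in range(min(k, len(remaining))):
        base = objective(selected, sim, cluster, lam)
        best = max(
            remaining,
            key=lambda j: objective(selected + [j], sim, cluster, lam) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: five segments, pairwise similarities, and three exemplar clusters.
sim = [
    [1.0, 0.7, 0.2, 0.1, 0.4],
    [0.7, 1.0, 0.1, 0.1, 0.0],
    [0.2, 0.1, 1.0, 0.9, 0.3],
    [0.1, 0.1, 0.9, 1.0, 0.0],
    [0.4, 0.0, 0.3, 0.0, 1.0],
]
cluster = [0, 0, 1, 1, 2]
proposals = greedy_select(sim, cluster, k=3, lam=0.5)
```

On this toy input the routine picks one segment from each of the three clusters. For monotone submodular objectives under a cardinality constraint, the greedy rule is known to reach at least a (1 - 1/e) fraction of the optimal value, which is the usual reason for choosing it.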
Abstract:
Object. The goal of this paper is to analyze the extension and relationships of glomus jugulare tumor with the temporal bone and the results of its surgical treatment aiming at preservation of the facial nerve. Based on the tumor extension and its relationships with the facial nerve, new criteria to be used in the selection of different surgical approaches are proposed. Methods. Between December 1997 and December 2007, 34 patients (22 female and 12 male) with glomus jugulare tumors were treated. Their mean age was 48 years. The mean follow-up was 52.5 months. Clinical findings included hearing loss in 88%, swallowing disturbance in 50%, and facial nerve palsy in 41%. Magnetic resonance imaging demonstrated a mass in the jugular foramen in all cases, a mass in the middle ear in 97%, a cervical mass in 85%, and an intradural mass in 41%. The tumor was supplied by the external carotid artery in all cases, the internal carotid artery in 44%, and the vertebral artery in 32%. Preoperative embolization was performed in 15 cases. The approach was tailored to each patient, and 4 types of approaches were designed. The infralabyrinthine retrofacial approach (Type A) was used in 32.5%; the infralabyrinthine pre- and retrofacial approach without occlusion of the external acoustic meatus (Type B) in 20.5%; the infralabyrinthine pre- and retrofacial approach with occlusion of the external acoustic meatus (Type C) in 41%; and the infralabyrinthine approach with transposition of the facial nerve and removal of the middle ear structures (Type D) in 6% of the patients. Results. Radical removal was achieved in 91% of the cases and partial removal in 9%. Among 20 patients without preoperative facial nerve dysfunction, the nerve was kept in anatomical position in 19 (95%), and facial nerve function was normal during the immediate postoperative period in 17 (85%). Six patients (17.6%) had a new lower cranial nerve deficit, but recovery of swallowing function was adequate in all cases.
Voice disturbance remained in all 6 cases. Cerebrospinal fluid leakage occurred in 6 patients (17.6%), with no need for reoperation in any of them. One patient died in the postoperative period due to pulmonary complications. The global recovery, based on the Karnofsky Performance Scale (KPS), was 100% in 15% of the patients, 90% in 45%, 80% in 33%, and 70% in 6%. Conclusions. Radical removal of glomus jugulare tumor can be achieved without anterior transposition of the facial nerve. The extension of dissection, however, should be tailored to each case based on tumor blood supply, preoperative symptoms, and tumor extension. The operative field provided by the retrofacial infralabyrinthine approach, or the pre- and retrofacial approaches, with or without closure of the external acoustic meatus, allows a wide exposure of the jugular foramen area. Global functional recovery based on the KPS is acceptable in 94% of the patients. (DOI: 10.3171/2008.10.JNS08612)
Abstract:
This paper describes how MPEG-4 object based video (obv) can be used to allow selected objects to be inserted into the play-out stream to a specific user based on a profile derived for that user. The application scenario described here is for personalized product placement, and considers the value of this application in the current and evolving commercial media distribution market given the huge emphasis media distributors are currently placing on targeted advertising. This level of application of video content requires a sophisticated content description and metadata system (e.g., MPEG-7). The scenario considers the requirement for global libraries to provide the objects to be inserted into the streams. The paper then considers the commercial trading of objects between the libraries, video service providers, advertising agencies and other parties involved in the service. Consequently a brokerage of video objects is proposed based on negotiation and trading using intelligent agents representing the various parties. The proposed Media Brokerage Platform is a multi-agent system structured in two layers. In the top layer, there is a collection of coarse grain agents representing the real world players – the providers and deliverers of media contents and the market regulator profiler – and, in the bottom layer, there is a set of finer grain agents constituting the marketplace – the delegate agents and the market agent. For knowledge representation (domain, strategic and negotiation protocols) we propose a Semantic Web approach based on ontologies. The media components contents should be represented in MPEG-7 and the metadata describing the objects to be traded should follow a specific ontology. The top layer content providers and deliverers are modelled by intelligent autonomous agents that express their will to transact – buy or sell – media components by registering at a service registry. 
The market regulator profiler creates, according to the selected profile, a market agent, which, in turn, checks the service registry for potential trading partners for a given component and invites them to the marketplace. The subsequent negotiation and actual transaction are performed by delegate agents in accordance with their profiles and the predefined rules of the market.
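The registration and partner lookup described above can be sketched in a few lines. This is only an illustration under assumptions: the class `ServiceRegistry`, the function `invite_to_marketplace`, and the agent and component names are hypothetical, and the negotiation carried out by the delegate agents is not modelled.

```python
from collections import defaultdict


class ServiceRegistry:
    """Records which agents want to buy or sell which media component."""

    def __init__(self):
        # component name -> list of (agent name, intent) registrations
        self._offers = defaultdict(list)

    def register(self, agent, intent, component):
        if intent not in ("buy", "sell"):
            raise ValueError("intent must be 'buy' or 'sell'")
        self._offers[component].append((agent, intent))

    def partners(self, component):
        """Return the buyers and the sellers registered for a component."""
        offers = self._offers.get(component, [])
        buyers = [agent for agent, intent in offers if intent == "buy"]
        sellers = [agent for agent, intent in offers if intent == "sell"]
        return buyers, sellers


def invite_to_marketplace(registry, component):
    """Market-agent step: pair every buyer with every seller of a component."""
    buyers, sellers = registry.partners(component)
    return [(b, s) for b in buyers for s in sellers if b != s]


registry = ServiceRegistry()
registry.register("library_A", "sell", "logo_clip")
registry.register("provider_B", "buy", "logo_clip")
registry.register("agency_C", "buy", "logo_clip")
registry.register("provider_B", "sell", "jingle")

invitations = invite_to_marketplace(registry, "logo_clip")
no_market = invite_to_marketplace(registry, "jingle")
```

Here `invitations` holds the buyer and seller pairs that would be handed to delegate agents, while `no_market` is empty because nobody has registered to buy the second component.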
Abstract:
Given a set of images of scenes containing different object categories (e.g. grass, roads), our objective is to discover these objects in each image and to use these object occurrences to perform scene classification (e.g. beach scene, mountain scene). We achieve this by using a supervised learning algorithm that can learn from few images, which eases the user's task. We use a probabilistic model to recognise the objects, and we then classify the scene based on the object occurrences. Experimental results are shown and evaluated to prove the validity of our proposal. Object recognition performance is compared to the approaches of He et al. (2004) and Marti et al. (2001) using their own datasets. Furthermore, an unsupervised method is implemented in order to evaluate the advantages and disadvantages of our supervised classification approach versus an unsupervised one.
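The second stage, classifying a scene from the objects found in it, can be illustrated as follows. The abstract does not specify the probabilistic model, so this sketch swaps in a naive Bayes classifier with Laplace smoothing as an assumption; the function names and the toy training scenes are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(scenes):
    """Fit counts from (scene_label, list_of_object_labels) pairs."""
    prior = Counter()
    counts = defaultdict(Counter)
    vocab = set()
    for label, objects in scenes:
        prior[label] += 1
        counts[label].update(objects)
        vocab.update(objects)
    return prior, counts, vocab


def classify(model, objects):
    """Return the scene label with the highest naive Bayes log-score."""
    prior, counts, vocab = model
    total = sum(prior.values())
    scores = {}
    for label in prior:
        n = sum(counts[label].values())
        score = math.log(prior[label] / total)
        for obj in objects:
            # Laplace smoothing keeps unseen objects from zeroing the score.
            score += math.log((counts[label][obj] + 1) / (n + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training set: object occurrences per labelled scene (illustrative only).
model = train([
    ("beach", ["sand", "sea", "sky"]),
    ("beach", ["sand", "sea"]),
    ("mountain", ["rock", "sky", "grass"]),
    ("mountain", ["rock", "snow", "sky"]),
])
beach_guess = classify(model, ["sea", "sand"])
mountain_guess = classify(model, ["rock", "sky"])
```

The small training set mirrors the "few images" setting mentioned in the abstract; smoothing matters in that setting because many objects will not have been seen with every scene class.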
Abstract:
Objective: To develop an instrument to evaluate the performance of primary health care in leprosy control actions from the perspective of users, and to carry out its face and content validation. Method: This is a methodological study carried out in four stages: development of the instrument, face and content validation, pre-test, and analysis of test-retest reliability. Results: The initial instrument, submitted to the judgment of 15 experts, was composed of 157 items. The face and content validation and the pre-test of the instrument were essential for excluding items and adjusting the instrument to evaluate the object under study. In the analysis of test-retest reliability, the instrument proved to be reliable. Conclusion: The instrument is considered adequate, but further studies are needed to test its psychometric properties.
Abstract:
Statistical computing when input/output is driven by a Graphical User Interface is considered. A proposal is made for automatic control of computational flow to ensure that only strictly required computations are actually carried out. The computational flow is modeled by a directed graph for implementation in any object-oriented programming language with symbolic manipulation capabilities. A complete implementation example is presented to compute and display frequency-based piecewise linear density estimators such as histograms or frequency polygons.
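The idea of a directed graph that triggers only the required computations can be sketched as below. This is an illustrative reading of the abstract, not the paper's implementation: the `Node` class, its staleness flag, and the helper functions are assumptions introduced for the example.

```python
class Node:
    """A value in the computational graph, recomputed only when stale and requested."""

    def __init__(self, func=None, parents=(), value=None):
        self.func = func                  # None marks an input node
        self.parents = list(parents)
        self.children = []
        self.value = value
        self.stale = func is not None     # computed nodes start out of date
        self.evaluations = 0              # how often func actually ran
        for parent in self.parents:
            parent.children.append(self)

    def set(self, value):
        """Change an input and mark everything downstream as stale."""
        self.value = value
        for child in self.children:
            child._invalidate()

    def _invalidate(self):
        if not self.stale:
            self.stale = True
            for child in self.children:
                child._invalidate()

    def get(self):
        """Return the value, recomputing it (and its ancestors) only if needed."""
        if self.stale:
            self.value = self.func(*(p.get() for p in self.parents))
            self.stale = False
            self.evaluations += 1
        return self.value


def histogram(data, edges):
    """Count observations per bin; the last bin is closed on the right."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for k in range(len(counts)):
            is_last = k == len(counts) - 1
            if edges[k] <= x < edges[k + 1] or (is_last and x == edges[-1]):
                counts[k] += 1
                break
    return counts


def frequency_polygon(counts, edges):
    """Pair each bin midpoint with its count."""
    mids = [(a + b) / 2 for a, b in zip(edges, edges[1:])]
    return list(zip(mids, counts))


data = Node(value=[0.1, 0.4, 0.5, 0.9, 1.0])
edges = Node(value=[0.0, 0.5, 1.0])
counts = Node(histogram, [data, edges])
polygon = Node(frequency_polygon, [counts, edges])

first = polygon.get()             # computes counts, then the polygon
polygon.get()                     # nothing is stale, so nothing is recomputed
evals_before = counts.evaluations
edges.set([0.0, 1.0])             # e.g. the user changes the bins in the GUI
second = polygon.get()            # only now are counts and polygon rebuilt
```

A repeated display request costs nothing, and a change to the bin edges triggers recomputation only when the polygon is next requested.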
Abstract:
In this paper we describe a proposal for defining the relationships between resources, users and services in a digital repository. Nowadays, virtual learning environments are widely used, but digital repositories are not yet fully integrated into the learning process. Our final goal is to provide end users with recommendation systems and reputation schemes that help them build a true learning community around the institutional repository, taking into account their educational context (i.e. the courses they are enrolled in) and their activity (i.e. system usage by their classmates and teachers). To do so, we extend the basic resource concept in a traditional digital repository by adding all the educational context and other elements from end-users' profiles, thus bridging users, resources and services, and shifting from a library-centered paradigm to a learning-centered one.
Abstract:
We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with a top-down control, which uses the previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This information provides us with the cores of the objects, which are used to acquire a more accurate object model. Growing these cores by means of specific active regions then allows us to obtain an accurate recognition of known regions. Next, a stage of general segmentation provides the segmentation of unknown regions by a bottom-up strategy. Finally, the last stage tries to perform a region fusion of known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Furthermore, experimental results are shown and evaluated to prove the validity of our proposal.
Abstract:
A persistent issue of debate in the area of 3D object recognition concerns the nature of the experientially acquired object models in the primate visual system. One prominent proposal in this regard has expounded the use of object centered models, such as representations of the objects' 3D structures in a coordinate frame independent of the viewing parameters [Marr and Nishihara, 1978]. In contrast to this is another proposal which suggests that the viewing parameters encountered during the learning phase might be inextricably linked to subsequent performance on a recognition task [Tarr and Pinker, 1989; Poggio and Edelman, 1990]. The 'object model', according to this idea, is simply a collection of the sample views encountered during training. Given that object centered recognition strategies have the attractive feature of leading to viewpoint independence, they have garnered much of the research effort in the field of computational vision. Furthermore, since human recognition performance seems remarkably robust in the face of imaging variations [Ellis et al., 1989], it has often been implicitly assumed that the visual system employs an object centered strategy. In the present study we examine this assumption more closely. Our experimental results with a class of novel 3D structures strongly suggest the use of a view-based strategy by the human visual system even when it has the opportunity of constructing and using object-centered models. In fact, for our chosen class of objects, the results seem to support a stronger claim: 3D object recognition is 2D view-based.
Abstract:
Visual working memory (VWM) involves maintaining and processing visual information, often for the purpose of making immediate decisions. Neuroimaging experiments on VWM provide evidence in support of a neural system mainly involving a fronto-parietal neuronal network, but the role of specific brain areas is less clear. A proposal that has recently generated considerable debate suggests that a dissociation of object and location VWM occurs within the prefrontal cortex, in dorsal and ventral regions, respectively. However, re-examination of the relevant literature presents a more robust distribution suggestive of a general caudal-rostral dissociation from occipital and parietal structures, caudally, to prefrontal regions, rostrally, corresponding to location and object memory, respectively. The purpose of the present study was to identify a dissociation of location and object VWM across two imaging methods (magnetoencephalography, MEG, and functional magnetic resonance imaging, fMRI). These two techniques provide complementary results due to the high temporal resolution of MEG and the high spatial resolution of fMRI. Identical location and object change detection tasks were employed across techniques, which is reported here for the first time. Moreover, this study is the first to use matched stimulus displays across location and object VWM conditions. The results from these two imaging methods provided convergent evidence of a location and object VWM dissociation favoring a general caudal-rostral view rather than the more common prefrontal dorsal-ventral view. Moreover, neural activity across techniques was correlated with behavioral performance for the first time and provided convergent results. This novel approach of combining imaging tools to study memory resulted in robust evidence suggesting a novel interpretation of location and object memory.
Accordingly, this study presents a novel context within which to explore the neural substrates of WM across imaging techniques and populations.
Abstract:
Subpixel methods increase the accuracy and efficiency of image detectors, processing units, and algorithms and provide very cost-effective systems for object tracking. Published methods achieve resolution increases of up to three orders of magnitude. In this Letter, we demonstrate that this limit can be theoretically improved by several orders of magnitude, permitting micropixel and submicropixel accuracies. The necessary condition for movement detection is that a single pixel changes its status. We show that an appropriate target design increases the probability of a pixel change for arbitrarily small shifts, thus increasing the detection accuracy of a tracking system. The proposal does not impose severe restrictions on the target or on the sensor, thus allowing easy experimental implementation.
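The detection condition stated above, that at least one pixel must change its status, can be explored with a toy model. The sketch below is an illustration under assumptions and does not reproduce the target design of the Letter: the target is taken to be a half-plane with a straight edge, a pixel counts as "on" when its centre lies inside the target, and the function name and parameters are hypothetical.

```python
import math


def min_detectable_shift(n, slope, offset=0.25):
    """Smallest horizontal shift (in pixels) that flips at least one pixel.

    Assumed model: the target covers x < offset + slope * y on an n x n grid,
    and pixel (i, j) is 'on' when its centre (i + 0.5, j + 0.5) is inside.
    Shifting the target right by d turns on every 'off' pixel whose centre is
    closer than d to the edge, so the answer is the smallest such gap.
    """
    best = math.inf
    for j in range(n):
        edge_x = offset + slope * (j + 0.5)   # edge position on row j
        for i in range(n):
            gap = (i + 0.5) - edge_x          # centre-to-edge distance
            if gap > 0:                       # pixel is currently 'off'
                best = min(best, gap)
    return best


# A vertical edge sits at the same distance from the pixel centres on every row.
straight = min_detectable_shift(100, slope=0.0)
# A slightly tilted edge meets each row at a different subpixel position.
tilted = min_detectable_shift(100, slope=0.01)
```

In this model the vertical edge needs a quarter-pixel shift before any pixel changes, whereas the tilted edge already changes one pixel after a shift of about five thousandths of a pixel, which shows how the shape of the target alone can lower the smallest detectable displacement.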