996 resultados para visual segmentation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Stimuli outside classical receptive fields significantly influence the neurons' activities in primary visual cortex. We propose that such contextual influences are used to segment regions by detecting the breakdown of homogeneity or translation invariance in the input, thus computing global region boundaries using local interactions. This is implemented in a biologically based model of V1, and demonstrated in examples of texture segmentation and figure-ground segregation. By contrast with traditional approaches, segmentation occurs without classification or comparison of features within or between regions and is performed by exactly the same neural circuit responsible for the dual problem of the grouping and enhancement of contours.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Stimuli outside classical receptive fields have been shown to exert significant influence over the activities of neurons in primary visual cortexWe propose that contextual influences are used for pre-attentive visual segmentation, in a new framework called segmentation without classification. This means that segmentation of an image into regions occurs without classification of features within a region or comparison of features between regions. This segmentation framework is simpler than previous computational approaches, making it implementable by V1 mechanisms, though higher leve l visual mechanisms are needed to refine its output. However, it easily handles a class of segmentation problems that are tricky in conventional methods. The cortex computes global region boundaries by detecting the breakdown of homogeneity or translation invariance in the input, using local intra-cortical interactions mediated by the horizontal connections. The difference between contextual influences near and far from region boundaries makes neural activities near region boundaries higher than elsewhere, making boundaries more salient for perceptual pop-out. This proposal is implemented in a biologically based model of V1, and demonstrated using examples of texture segmentation and figure-ground segregation. The model performs segmentation in exactly the same neural circuit that solves the dual problem of the enhancement of contours, as is suggested by experimental observations. Its behavior is compared with psychophysical and physiological data on segmentation, contour enhancement, and contextual influences. We discuss the implications of segmentation without classification and the predictions of our V1 model, and relate it to other phenomena such as asymmetry in visual search.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Mestre em Engenharia Biomédica

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Cells in adult primary visual cortex are capable of integrating information over much larger portions of the visual field than was originally thought. Moreover, their receptive field properties can be altered by the context within which local features are presented and by changes in visual experience. The substrate for both spatial integration and cortical plasticity is likely to be found in a plexus of long-range horizontal connections, formed by cortical pyramidal cells, which link cells within each cortical area over distances of 6-8 mm. The relationship between horizontal connections and cortical functional architecture suggests a role in visual segmentation and spatial integration. The distribution of lateral interactions within striate cortex was visualized with optical recording, and their functional consequences were explored by using comparable stimuli in human psychophysical experiments and in recordings from alert monkeys. They may represent the substrate for perceptual phenomena such as illusory contours, surface fill-in, and contour saliency. The dynamic nature of receptive field properties and cortical architecture has been seen over time scales ranging from seconds to months. One can induce a remapping of the topography of visual cortex by making focal binocular retinal lesions. Shorter-term plasticity of cortical receptive fields was observed following brief periods of visual stimulation. The mechanisms involved entailed, for the short-term changes, altering the effectiveness of existing cortical connections, and for the long-term changes, sprouting of axon collaterals and synaptogenesis. The mutability of cortical function implies a continual process of calibration and normalization of the perception of visual attributes that is dependent on sensory experience throughout adulthood and might further represent the mechanism of perceptual learning.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Nowadays robotic applications are widespread and most of the manipulation tasks are efficiently solved. However, Deformable-Objects (DOs) still represent a huge limitation for robots. The main difficulty in DOs manipulation is dealing with the shape and dynamics uncertainties, which prevents the use of model-based approaches (since they are excessively computationally complex) and makes sensory data difficult to interpret. This thesis reports the research activities aimed to address some applications in robotic manipulation and sensing of Deformable-Linear-Objects (DLOs), with particular focus to electric wires. In all the works, a significant effort was made in the study of an effective strategy for analyzing sensory signals with various machine learning algorithms. In the former part of the document, the main focus concerns the wire terminals, i.e. detection, grasping, and insertion. First, a pipeline that integrates vision and tactile sensing is developed, then further improvements are proposed for each module. A novel procedure is proposed to gather and label massive amounts of training images for object detection with minimal human intervention. Together with this strategy, we extend a generic object detector based on Convolutional-Neural-Networks for orientation prediction. The insertion task is also extended by developing a closed-loop control capable to guide the insertion of a longer and curved segment of wire through a hole, where the contact forces are estimated by means of a Recurrent-Neural-Network. In the latter part of the thesis, the interest shifts to the DLO shape. Robotic reshaping of a DLO is addressed by means of a sequence of pick-and-place primitives, while a decision making process driven by visual data learns the optimal grasping locations exploiting Deep Q-learning and finds the best releasing point. The success of the solution leverages on a reliable interpretation of the DLO shape. For this reason, further developments are made on the visual segmentation.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Ett ämne som väckt intresse både inom industrin och forskningen är hantering av kundförhållanden (CRM, eng. Customer Relationship Management), dvs. en kundorienterad affärsstrategi där företagen från att ha varit produktorienterade väljer att bli mera kundcentrerade. Numera kan kundernas beteende och aktiviteter lätt registreras och sparas med hjälp av integrerade affärssystem (ERP, eng. Enterprise Resource Planning) och datalager (DW, eng. Data Warehousing). Kunder med olika preferenser och köpbeteende skapar sin egen ”signatur” i synnerhet via användningen av kundkort, vilket möjliggör mångsidig modellering av kundernas köpbeteende. För att få en översikt av kundernas köpbeteende och deras lönsamhet, används ofta kundsegmentering som en metod för att indela kunderna i grupper utgående från deras likheter. De mest använda metoderna för kundsegmentering är analytiska modeller konstruerade för en viss tidsperiod. Dessa modeller beaktar inte att kundernas beteende kan förändras med tiden. I föreliggande avhandling skapas en holistisk översikt av kundernas karaktär och köpbeteende som utöver de konventionella segmenteringsmodellerna även beaktar dynamiken i köpbeteendet. Dynamiken i en kundsegmenteringsmodell innefattar förändringar i segmentens struktur och innehåll, samt förändringen av individuella kunders tillhörighet i ett segment (s.k migrationsanalyser). Vardera förändringen modelleras, analyseras och exemplifieras med visuella datautvinningstekniker, främst med självorganiserande kartor (SOM, eng. Self-Organizing Maps) och självorganiserande tidskartor (SOTM), en vidareutveckling av SOM. Visualiseringen anteciperas underlätta tolkningen av identifierade mönster och göra processen med kunskapsöverföring mellan den som gör analysen och beslutsfattaren smidigare. Asiakkuudenhallinta (CRM) eli organisaation muuttaminen tuotepainotteisesta asiakaskeskeiseksi on herättänyt mielenkiintoa niin yliopisto- kuin yritysmaailmassakin. Asiakkaiden käyttäytymistä ja toimintaa pystytään nykyään helposti tallentamaan ja varastoimaan toiminnanohjausjärjestelmien ja tietovarastojen avulla; asiakkaat jättävät jatkuvasti piirteistään ja ostokäyttäytymisestään kertovia tietojälkiä, joita voidaan analysoida. On tavallista, että asiakkaat poikkeavat toisistaan eri tavoin, ja heidän mieltymyksensä kuten myös ostokäyttäytymisensä saattavat olla hyvinkin erilaisia. Asiakaskäyttäytymisen monimuotoisuuteen ja tuottavuuteen paneuduttaessa käytetäänkin laajalti asiakassegmentointia eli asiakkaiden jakamista ryhmiin samankaltaisuuden perusteella. Perinteiset asiakassegmentoinnin ratkaisut ovat usein yksittäisiä analyyttisia malleja, jotka on tehty tietyn aikajakson perusteella. Tämän vuoksi ne monesti jättävät huomioimatta sen, että asiakkaiden käyttäytyminen saattaa ajan kuluessa muuttua. Tässä väitöskirjassa pyritäänkin tarjoamaan holistinen kuva asiakkaiden ominaisuuksista ja ostokäyttäytymisestä tarkastelemalla kahta muutosvoimaa tiettyyn aikarajaukseen perustuvien perinteisten segmentointimallien lisäksi. Nämä kaksi asiakassegmentointimallin dynamiikkaa ovat muutokset segmenttien rakenteessa ja muutokset yksittäisten asiakkaiden kuulumisessa ryhmään. Ensimmäistä dynamiikkaa lähestytään ajallisen asiakassegmentoinnin avulla, jossa visualisoidaan ajan kuluessa tapahtuvat muutokset segmenttien rakenteissa ja profiileissa. Toista dynamiikkaa taas lähestytään käyttäen nk. segmenttisiirtymien analyysia, jossa visuaalisin keinoin tunnistetaan samantyyppisesti segmentistä toiseen vaihtavat asiakkaat. Visualisoinnin tehtävänä on tukea havaittujen kaavojen tulkitsemista sekä helpottaa tiedonsiirtoa analysoijan ja päättäjien välillä. Visuaalisia tiedonlouhintamenetelmiä, kuten itseorganisoivia karttoja ja niiden laajennuksia, käytetään osoittamaan näiden menetelmien hyödyllisyys sekä asiakkuudenhallinnassa yleisesti että erityisesti asiakassegmentoinnissa.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Companies require information in order to gain an improved understanding of their customers. Data concerning customers, their interests and behavior are collected through different loyalty programs. The amount of data stored in company data bases has increased exponentially over the years and become difficult to handle. This research area is the subject of much current interest, not only in academia but also in practice, as is shown by several magazines and blogs that are covering topics on how to get to know your customers, Big Data, information visualization, and data warehousing. In this Ph.D. thesis, the Self-Organizing Map and two extensions of it – the Weighted Self-Organizing Map (WSOM) and the Self-Organizing Time Map (SOTM) – are used as data mining methods for extracting information from large amounts of customer data. The thesis focuses on how data mining methods can be used to model and analyze customer data in order to gain an overview of the customer base, as well as, for analyzing niche-markets. The thesis uses real world customer data to create models for customer profiling. Evaluation of the built models is performed by CRM experts from the retailing industry. The experts considered the information gained with help of the models to be valuable and useful for decision making and for making strategic planning for the future.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Advancements in information technology have made it possible for organizations to gather and store vast amounts of data of their customers. Information stored in databases can be highly valuable for organizations. However, analyzing large databases has proven to be difficult in practice. For companies in the retail industry, customer intelligence can be used to identify profitable customers, their characteristics, and behavior. By clustering customers into homogeneous groups, companies can more effectively manage their customer base and target profitable customer segments. This thesis will study the use of the self-organizing map (SOM) as a method for analyzing large customer datasets, clustering customers, and discovering information about customer behavior. Aim of the thesis is to find out whether the SOM could be a practical tool for retail companies to analyze their customer data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Speech is typically a multimodal phenomenon, yet few studies have focused on the exclusive contributions of visual cues to language acquisition. To address this gap, we investigated whether visual prosodic information can facilitate speech segmentation. Previous research has demonstrated that language learners can use lexical stress and pitch cues to segment speech and that learners can extract this information from talking faces. Thus, we created an artificial speech stream that contained minimal segmentation cues and paired it with two synchronous facial displays in which visual prosody was either informative or uninformative for identifying word boundaries. Across three familiarisation conditions (audio stream alone, facial streams alone, and paired audiovisual), learning occurred only when the facial displays were informative to word boundaries, suggesting that facial cues can help learners solve the early challenges of language acquisition.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Extracting human postural information from video sequences has proved a difficult research question. The most successful approaches to date have been based on particle filtering, whereby the underlying probability distribution is approximated by a set of particles. The shape of the underlying observational probability distribution plays a significant role in determining the success, both accuracy and efficiency, of any visual tracker. In this paper we compare approaches used by other authors and present a cost path approach which is commonly used in image segmentation problems, however is currently not widely used in tracking applications.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motivation. The study of human brain development in itsearly stage is today possible thanks to in vivo fetalmagnetic resonance imaging (MRI) techniques. Aquantitative analysis of fetal cortical surfacerepresents a new approach which can be used as a markerof the cerebral maturation (as gyration) and also forstudying central nervous system pathologies [1]. However,this quantitative approach is a major challenge forseveral reasons. First, movement of the fetus inside theamniotic cavity requires very fast MRI sequences tominimize motion artifacts, resulting in a poor spatialresolution and/or lower SNR. Second, due to the ongoingmyelination and cortical maturation, the appearance ofthe developing brain differs very much from thehomogenous tissue types found in adults. Third, due tolow resolution, fetal MR images considerably suffer ofpartial volume (PV) effect, sometimes in large areas.Today extensive efforts are made to deal with thereconstruction of high resolution 3D fetal volumes[2,3,4] to cope with intra-volume motion and low SNR.However, few studies exist related to the automatedsegmentation of MR fetal imaging. [5] and [6] work on thesegmentation of specific areas of the fetal brain such asposterior fossa, brainstem or germinal matrix. Firstattempt for automated brain tissue segmentation has beenpresented in [7] and in our previous work [8]. Bothmethods apply the Expectation-Maximization Markov RandomField (EM-MRF) framework but contrary to [7] we do notneed from any anatomical atlas prior. Data set &Methods. Prenatal MR imaging was performed with a 1-Tsystem (GE Medical Systems, Milwaukee) using single shotfast spin echo (ssFSE) sequences (TR 7000 ms, TE 180 ms,FOV 40 x 40 cm, slice thickness 5.4mm, in plane spatialresolution 1.09mm). Each fetus has 6 axial volumes(around 15 slices per volume), each of them acquired inabout 1 min. Each volume is shifted by 1 mm with respectto the previous one. Gestational age (GA) ranges from 29to 32 weeks. Mother is under sedation. Each volume ismanually segmented to extract fetal brain fromsurrounding maternal tissues. Then, in-homogeneityintensity correction is performed using [9] and linearintensity normalization is performed to have intensityvalues that range from 0 to 255. Note that due tointra-tissue variability of developing brain someintensity variability still remains. For each fetus, ahigh spatial resolution image of isotropic voxel size of1.09 mm is created applying [2] and using B-splines forthe scattered data interpolation [10] (see Fig. 1). Then,basal ganglia (BS) segmentation is performed on thissuper reconstructed volume. Active contour framework witha Level Set (LS) implementation is used. Our LS follows aslightly different formulation from well-known Chan-Vese[11] formulation. In our case, the LS evolves forcing themean of the inside of the curve to be the mean intensityof basal ganglia. Moreover, we add local spatial priorthrough a probabilistic map created by fitting anellipsoid onto the basal ganglia region. Some userinteraction is needed to set the mean intensity of BG(green dots in Fig. 2) and the initial fitting points forthe probabilistic prior map (blue points in Fig. 2). Oncebasal ganglia are removed from the image, brain tissuesegmentation is performed as described in [8]. Results.The case study presented here has 29 weeks of GA. Thehigh resolution reconstructed volume is presented in Fig.1. The steps of BG segmentation are shown in Fig. 2.Overlap in comparison with manual segmentation isquantified by the Dice similarity index (DSI) equal to0.829 (values above 0.7 are considered a very goodagreement). Such BG segmentation has been applied on 3other subjects ranging for 29 to 32 GA and the DSI hasbeen of 0.856, 0.794 and 0.785. Our segmentation of theinner (red and blue contours) and outer cortical surface(green contour) is presented in Fig. 3. Finally, torefine the results we include our WM segmentation in theFreesurfer software [12] and some manual corrections toobtain Fig.4. Discussion. Precise cortical surfaceextraction of fetal brain is needed for quantitativestudies of early human brain development. Our workcombines the well known statistical classificationframework with the active contour segmentation forcentral gray mater extraction. A main advantage of thepresented procedure for fetal brain surface extraction isthat we do not include any spatial prior coming fromanatomical atlases. The results presented here arepreliminary but promising. Our efforts are now in testingsuch approach on a wider range of gestational ages thatwe will include in the final version of this work andstudying as well its generalization to different scannersand different type of MRI sequences. References. [1]Guibaud, Prenatal Diagnosis 29(4) (2009). [2] Rousseau,Acad. Rad. 13(9), 2006, [3] Jiang, IEEE TMI 2007. [4]Warfield IADB, MICCAI 2009. [5] Claude, IEEE Trans. Bio.Eng. 51(4) (2004). [6] Habas, MICCAI (Pt. 1) 2008. [7]Bertelsen, ISMRM 2009 [8] Bach Cuadra, IADB, MICCAI 2009.[9] Styner, IEEE TMI 19(39 (2000). [10] Lee, IEEE Trans.Visual. And Comp. Graph. 3(3), 1997, [11] Chan, IEEETrans. Img. Proc, 10(2), 2001 [12] Freesurfer,http://surfer.nmr.mgh.harvard.edu.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The large and growing number of digital images is making manual image search laborious. Only a fraction of the images contain metadata that can be used to search for a particular type of image. Thus, the main research question of this thesis is whether it is possible to learn visual object categories directly from images. Computers process images as long lists of pixels that do not have a clear connection to high-level semantics which could be used in the image search. There are various methods introduced in the literature to extract low-level image features and also approaches to connect these low-level features with high-level semantics. One of these approaches is called Bag-of-Features which is studied in the thesis. In the Bag-of-Features approach, the images are described using a visual codebook. The codebook is built from the descriptions of the image patches using clustering. The images are described by matching descriptions of image patches with the visual codebook and computing the number of matches for each code. In this thesis, unsupervised visual object categorisation using the Bag-of-Features approach is studied. The goal is to find groups of similar images, e.g., images that contain an object from the same category. The standard Bag-of-Features approach is improved by using spatial information and visual saliency. It was found that the performance of the visual object categorisation can be improved by using spatial information of local features to verify the matches. However, this process is computationally heavy, and thus, the number of images must be limited in the spatial matching, for example, by using the Bag-of-Features method as in this study. Different approaches for saliency detection are studied and a new method based on the Hessian-Affine local feature detector is proposed. The new method achieves comparable results with current state-of-the-art. The visual object categorisation performance was improved by using foreground segmentation based on saliency information, especially when the background could be considered as clutter.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Kandidaatintyö tehtiin osana PulpVision-tutkimusprojektia, jonka tarkoituksena on kehittää kuvapohjaisia laskenta- ja luokittelumetodeja sellun laaduntarkkailuun paperin valmistuksessa. Tämän tutkimusprojektin osana on aiemmin kehitetty metodi, jolla etsittiin kaarevia rakenteita kuvista, ja tätä metodia hyödynnettiin kuitujen etsintään kuvista. Tätä metodia käytettiin lähtökohtana kandidaatintyölle. Työn tarkoituksena oli tutkia, voidaanko erilaisista kuitukuvista laskettujen piirteiden avulla tunnistaa kuvassa olevien kuitujen laji. Näissä kuitukuvissa oli kuituja neljästä eri puulajista ja yhdestä kasvista. Nämä lajit olivat akasia, koivu, mänty, eukalyptus ja vehnä. Jokaisesta lajista valittiin 100 kuitukuvaa ja nämä kuvat jaettiin kahteen ryhmään, joista ensimmäistä käytettiin opetusryhmänä ja toista testausryhmänä. Opetusryhmän avulla jokaiselle kuitulajille laskettiin näitä kuvaavia piirteitä, joiden avulla pyrittiin tunnistamaan testausryhmän kuvissa olevat kuitulajit. Nämä kuvat oli tuottanut CEMIS-Oulu (Center for Measurement and Information Systems), joka on mittaustekniikkaan keskittynyt yksikkö Oulun yliopistossa. Yksittäiselle opetusryhmän kuitukuvalle laskettiin keskiarvot ja keskihajonnat kolmesta eri piirteestä, jotka olivat pituus, leveys ja kaarevuus. Lisäksi laskettiin, kuinka monta kuitua kuvasta löydettiin. Näiden piirteiden eri yhdistelmien avulla testattiin tunnistamisen tarkkuutta käyttämällä k:n lähimmän naapurin menetelmää ja Naiivi Bayes -luokitinta testausryhmän kuville. Testeistä saatiin lupaavia tuloksia muun muassa pituuden ja leveyden keskiarvoja käytettäessä saavutettiin jopa noin 98 %:n tarkkuus molemmilla algoritmeilla. Tunnistuksessa kuitujen keskimäärinen pituus vaikutti olevan kuitukuvia parhaiten kuvaava piirre. Käytettyjen algoritmien välillä ei ollut suurta vaihtelua tarkkuudessa. Testeissä saatujen tulosten perusteella voidaan todeta, että kuitukuvien tunnistaminen on mahdollista. Testien perusteella kuitukuvista tarvitsee laskea vain kaksi piirrettä, joilla kuidut voidaan tunnistaa tarkasti. Käytetyt lajittelualgoritmit olivat hyvin yksinkertaisia, mutta ne toimivat testeissä hyvin.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

El processament d'imatges mèdiques és una important àrea de recerca. El desenvolupament de noves tècniques que assisteixin i millorin la interpretació visual de les imatges de manera ràpida i precisa és fonamental en entorns clínics reals. La majoria de contribucions d'aquesta tesi són basades en Teoria de la Informació. Aquesta teoria tracta de la transmissió, l'emmagatzemament i el processament d'informació i és usada en camps tals com física, informàtica, matemàtica, estadística, biologia, gràfics per computador, etc. En aquesta tesi, es presenten nombroses eines basades en la Teoria de la Informació que milloren els mètodes existents en l'àrea del processament d'imatges, en particular en els camps del registre i la segmentació d'imatges. Finalment es presenten dues aplicacions especialitzades per l'assessorament mèdic que han estat desenvolupades en el marc d'aquesta tesi.