833 results for distinctness of image


Relevance:

80.00%

Publisher:

Abstract:

In an increasingly connected, image-driven society spread across various online social networks, each user who takes pictures of him/herself and posts them online performs a self-registration that becomes self-representation through aggregated daily practices on various themes. This research studies the dynamics of the relations between this photographic self-representation and agency, technology, affectivity, and consumption, relations that go beyond narcissism and compose a being integrated with visual power and its reverberations, which I call Being-Image. The goal of this paper is an anthropological look at these practices. To that end, the research was carried out in the field of the anthropology of the image and cyber anthropology, with various interlocutors engaged through dialogues, analysis, and virtual meetings. Besides conversations and interviews, I analyzed profiles on the social network as well as daily posts, photo albums available online, shares, and, above all, selfies.

Relevance:

80.00%

Publisher:

Abstract:

This research reflects on the dynamics of television reception by studying the Brazilian TV miniseries Hoje é Dia de Maria, produced by Globo Television Network. More generally, it aims to promote inferences in the process of image reading, mainly aesthetic reading in a school context, with a view to forming visually proficient readers. The research was conducted with third-year students at a state high school in the city of Natal, Rio Grande do Norte. The theoretical framework draws on the assumptions of cognitive social interactionism to understand language, on the ideas of Bakhtin (1992) and Vygotsky (1998), which enabled us to understand social interaction, and on the Theory of Aesthetic Reception and Aesthetic Effect of Jauss (1979) and Iser (1999), which provided a better understanding of aesthetic experience, aesthetic effects, and the production of meaning. The methodological approach is qualitative and interpretive, carried out through interviews, observation, a questionnaire, and a set of investigative activities, such as introductory exposition of themes, handing out of images, and a mediation process. The research is the result of an action-research process within a pedagogical intervention in a state school. The results indicate that the interactional linguistic resources used by the speakers revealed a lack of prior knowledge and repertoire regarding image reading, which initially led them to a cursory reading. It was evident that the respondents were unaware of the initial proposal. However, over the course of the meetings their transformation became apparent: the pre-established concepts were analyzed with the help of mediation, so that by the end the group felt more autonomous and confident in reading images. The survey also yielded significant data that the school could use to develop new methods of teaching televisual reading.

Relevance:

80.00%

Publisher:

Abstract:

Lung cancer is one of the most common types of cancer and has the highest mortality rate. Patient survival is highly correlated with early detection. Computed Tomography greatly aids the early detection of lung cancer by offering a minimally invasive diagnostic tool. However, the large amount of data per examination makes interpretation difficult, which leads radiologists to miss nodules. This thesis presents the development of a computer-aided detection (CADe) tool for detecting lung nodules in Computed Tomography studies. The system, called LCD-OpenPACS (Lung Cancer Detection - OpenPACS), is intended to be integrated into the OpenPACS system and to meet all the requirements for use in the workflow of health facilities belonging to the SUS (the Brazilian public health system). LCD-OpenPACS uses image processing techniques (Region Growing and Watershed), feature extraction (Histogram of Oriented Gradients), dimensionality reduction (Principal Component Analysis), and a classifier (Support Vector Machine). The system was tested on 220 cases, totaling 296 pulmonary nodules, and achieved a sensitivity of 94.4% with 7.04 false positives per case. The total processing time was approximately 10 minutes per case. The system detected pulmonary nodules (solitary, juxtavascular, ground-glass opacity, and juxtapleural) between 3 mm and 30 mm.
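
The abstract names Region Growing as one of the segmentation steps. As a rough illustration only, the following Python sketch grows a region from a seed pixel on a toy 2-D grid. The function name, the seed, the tolerance, the 4-connectivity, and the pixel values are all assumptions made for this example; it is not the LCD-OpenPACS implementation.

```python
from collections import deque

def region_grow(image, seed, tol):
    """Grow a region from `seed`, adding 4-connected pixels whose
    intensity differs from the seed intensity by at most `tol`."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_val) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Toy 4x4 "slice": a bright 2x2 blob on a dark background (made-up values).
toy = [
    [10, 12, 11, 10],
    [11, 90, 92, 12],
    [10, 91, 93, 11],
    [12, 11, 10, 10],
]
blob = region_grow(toy, seed=(1, 1), tol=5)
```

Here `blob` contains only the four bright pixels, because every background pixel differs from the seed intensity by far more than the tolerance. Comparing against the seed value rather than the running region mean is a simplification chosen for brevity.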

Relevance:

80.00%

Publisher:

Abstract:

This work discusses the ontology of the visible in the thought of Maurice Merleau-Ponty (1908-1961), which points to a depth and opacity of the perceived world that stand opposed to the transparency of the geometric world conceived by René Descartes (1596-1650). First, we approach the Cartesian account developed in Descartes' Dioptrics, the first of three scientific essays published in 1637 and introduced by the famous Discourse on the Method. Here the research discusses the mechanistic explanation that the modern philosopher gives of vision, a process comprising the formation of images on the retina, their communication to the brain, and the subsequent reading performed by an immaterial mind. It discusses the notion of the image as a result of interpretation by the spirit because, for Descartes, it is not the eye that sees but the spirit, which reads and decodes the signals that the body receives from the world. Next, we reflect on Merleau-Ponty's criticism of the overflight thinking present in Descartes' Dioptrics. The reference here is the third part of the book The Eye and the Spirit (1961), in which the intellectualist approach to vision is considered a failed attempt to move away from the visible in order to rebuild it from anywhere. The research then reflects on the new ontology proposed by Merleau-Ponty, which thinks being without departing from the enigmas of the body and of vision, enigmas that show a promiscuity between the seer and the seen, between the sentient and the sensible. Thus, this work discusses how visibility was treated by the contemporary philosopher: not as something to be judged by the spirit in order to reach the real nature of things, but as a manifestation of the things themselves. Finally, this research explores the ontology of the visible in Merleau-Pontian thought, an ontology that does not rebuild or appropriate the visible through overflight thinking, but proceeds from visibility itself, as an original and constant relation with depth in the world.

Relevance:

80.00%

Publisher:

Abstract:

This dissertation aims to analyze and understand the process and practices of political marketing strategies applied to the Facebook and Twitter profiles of Cássio Cunha Lima, PSDB candidate for governor of Paraíba in the 2014 elections. The work is divided into three parts. The first two chapters, both theoretical, underpin the discussion of the Internet as a campaign space and of political campaign marketing, as well as the different communication and electoral marketing strategies already presented in the literature. Next, a section presents the methodology, followed by a discussion of the empirical data analysis. Finally, we present the conclusions. The analysis takes as its starting point the models of Figueiredo et al. (1998) and Albuquerque (1999) to observe the traditional strategies, and suggests including strategies typically observed on the Internet. The methodology used was qualitative and quantitative content analysis, based on variables that list the different campaign strategies. To achieve the purpose of this research, we conducted a case study with the online campaign of Cássio Cunha Lima as the object of analysis. The case study started from the construction of the candidate's biographical and political profile, presented and discussed in the text. This research also made use of virtual ethnography. To that end, the politician's Facebook and Twitter accounts were monitored with the help of the image capture program Greenshot, using pre-defined categories of analysis such as calendar, prestige and support, negative campaign, and engagement, among others. The period chosen for monitoring the candidate's official profiles was from 24 August to 28 October 2014, because it covers the pre-election, election, and post-election phases, when the candidate and his marketing team were most active on the social media selected for analysis. The results indicate that the mobilization strategy (online and offline), combined with promotion of the candidate's schedule, is predominant in Cássio's social media. They also indicate that these media do not show the failure of the candidate's campaign in 2014.

Relevance:

80.00%

Publisher:

Abstract:

Content-based image retrieval is important for various purposes, such as disease diagnosis from computerized tomography. The social and economic relevance of image retrieval systems has created the need to improve them. Content-based image retrieval systems are composed of two stages: feature extraction and similarity measurement. The similarity stage is still a challenge because of the wide variety of similarity measurement functions, which can be combined with the different techniques present in the retrieval process and return results that aren’t always the most satisfactory. The functions most commonly used to measure similarity are the Euclidean and Cosine functions, but some researchers have noted limitations of these conventional proximity functions in the similarity search step. For that reason, Bregman divergences (Kullback-Leibler and generalized I-divergence) have attracted the attention of researchers because of their flexibility in similarity analysis. The aim of this research was therefore to conduct a comparative study of Bregman divergences against the Euclidean and Cosine functions in the similarity step of content-based image retrieval, checking the advantages and disadvantages of each function. To this end, a content-based image retrieval system was built in two stages, offline and online, using the BSM, FISM, BoVW, and BoVW-SPM approaches. With this system, three groups of experiments were created using the Caltech101, Oxford, and UK-bench databases. The performance of the system with the different similarity functions was tested using the following evaluation measures: Mean Average Precision, normalized Discounted Cumulative Gain, precision at k, and precision x recall. Finally, this study shows that the Bregman divergences (Kullback-Leibler and generalized I-divergence) obtain better results than the Euclidean and Cosine measures, with significant gains for content-based image retrieval.
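
To make the comparison concrete, the four dissimilarity functions named in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: feature vectors are plain lists of non-negative numbers, the `eps` smoothing term is added here only to avoid division by zero, and none of the function names come from the study itself.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cosine_distance(p, q):
    """One minus the cosine of the angle between p and q."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence; p and q are assumed to be
    normalized histograms (each sums to 1)."""
    return sum(a * math.log((a + eps) / (b + eps))
               for a, b in zip(p, q) if a > 0)

def generalized_i_divergence(p, q, eps=1e-12):
    """Generalized I-divergence: sum of p*log(p/q) - p + q.
    Unlike KL, it does not require normalized inputs."""
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            total += a * math.log((a + eps) / (b + eps))
        total += b - a
    return total

p = [0.2, 0.8]
q = [0.5, 0.5]
```

Two properties are worth noticing. Both Bregman divergences are asymmetric, so `kl_divergence(p, q)` differs from `kl_divergence(q, p)`, which matters when deciding whether the query or the database image goes first. And for normalized histograms the generalized I-divergence reduces to Kullback-Leibler, since the `- p + q` terms cancel.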

Relevance:

80.00%

Publisher:

Abstract:

The goal of modern radiotherapy is to precisely deliver a prescribed radiation dose to delineated target volumes that contain a significant amount of tumor cells while sparing the surrounding healthy tissues and organs. Precise delineation of treatment and avoidance volumes is the key to precision radiation therapy. In recent years, considerable clinical and research efforts have been devoted to integrating MRI into the radiotherapy workflow, motivated by its superior soft tissue contrast and functional imaging capability. Dynamic contrast-enhanced MRI (DCE-MRI) is a noninvasive technique that measures properties of tissue microvasculature. Its sensitivity to radiation-induced vascular pharmacokinetic (PK) changes has been preliminarily demonstrated. In spite of its great potential, two major challenges have limited the clinical application of DCE-MRI in radiotherapy assessment: the technical limitations of accurate DCE-MRI imaging implementation, and the need for novel DCE-MRI data analysis methods that provide richer functional heterogeneity information.

This study aims to improve current DCE-MRI techniques and to develop new DCE-MRI analysis methods specifically for radiotherapy assessment. The study is accordingly divided into two parts. The first part focuses on temporal resolution as one of the key DCE-MRI technical factors and proposes some improvements to it; the second part explores the potential value of image heterogeneity analysis and of combining multiple PK models for therapeutic response assessment, and develops several novel DCE-MRI data analysis methods.

I. Improvement of DCE-MRI temporal resolution. First, the feasibility of improving DCE-MRI temporal resolution via image undersampling was studied. Specifically, a novel iterative MR image reconstruction algorithm was studied for DCE-MRI reconstruction. This algorithm was built on the recently developed compressed sensing (CS) theory. By using a limited k-space acquisition with shorter imaging time, images can be reconstructed iteratively under the regularization of a newly proposed total generalized variation (TGV) penalty term. In a retrospective study of brain radiosurgery patient DCE-MRI scans under IRB approval, the clinically obtained image data were selected as reference data, and the simulated accelerated k-space acquisition was generated by undersampling the full k-space of the reference images with designed sampling grids. Two undersampling strategies were proposed: 1) a radial multi-ray grid with a special angular distribution was adopted to sample each slice of the full k-space; 2) a series of Cartesian random sampling grids with spatiotemporal constraints from adjacent frames was adopted to sample the dynamic k-space series at a slice location. Two sets of PK parameter maps were generated, one from the undersampled data and one from the fully sampled data. Multiple quantitative measurements and statistical studies were performed to evaluate the accuracy of the PK maps generated from the undersampled data with reference to the PK maps generated from the fully sampled data. Results showed that at a simulated acceleration factor of four, PK maps could be faithfully calculated from the DCE images reconstructed from undersampled data, and no statistically significant differences were found between the regional PK mean values from the undersampled and fully sampled data sets. DCE-MRI acceleration using the investigated image reconstruction method therefore appears feasible and promising.
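
As a hedged illustration of what simulated undersampling at an acceleration factor of four can look like, the Python sketch below builds a random Cartesian mask over phase-encoding lines. It is not the sampling grid designed in the thesis: the fully sampled central band, the function name `cartesian_mask`, and all parameter values are assumptions for this example, and the spatiotemporal constraints between adjacent frames are omitted.

```python
import random

def cartesian_mask(n_lines, acceleration, n_center, seed=0):
    """Choose which phase-encoding lines to keep: always keep `n_center`
    central lines, then draw the rest at random so that
    n_lines // acceleration lines are kept in total."""
    rng = random.Random(seed)
    n_keep = n_lines // acceleration
    start = (n_lines - n_center) // 2
    center = set(range(start, start + n_center))
    outer = [i for i in range(n_lines) if i not in center]
    extra = rng.sample(outer, n_keep - n_center)
    kept = center | set(extra)
    return [i in kept for i in range(n_lines)]

# 128 lines, keep one quarter of them, 8 of which form the central band.
mask = cartesian_mask(n_lines=128, acceleration=4, n_center=8)
```

Applying such a mask to the full k-space of a reference image (zeroing the lines where the mask is `False`) yields the retrospectively undersampled data that an iterative reconstruction would then take as input. Keeping the center of k-space is a common design choice because most image energy is concentrated there.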

Second, for high temporal resolution DCE-MRI, a new PK model fitting method was developed to solve for PK parameters with better accuracy and efficiency. This method is based on a derivative-based reformulation of the commonly used Tofts PK model, which is usually presented as an integral expression. The method also includes an advanced Kolmogorov-Zurbenko (KZ) filter to remove potential noise effects in the data, and it solves for the PK parameters as a linear problem in matrix format. In the computer simulation study, PK parameters representing typical intracranial values were selected as references to simulate DCE-MRI data at different temporal resolutions and noise levels. Results showed that at both high temporal resolutions (<1 s) and a clinically feasible temporal resolution (~5 s), the new method calculated PK parameters more accurately than current calculation methods at clinically relevant noise levels; at high temporal resolutions, its calculation efficiency was superior to that of current methods by a factor on the order of 10^2. In a retrospective study of clinical brain DCE-MRI scans, the PK maps derived from the proposed method were comparable with the results from current methods. Based on these results, it can be concluded that this new method can be used for accurate and efficient PK model fitting for high temporal resolution DCE-MRI.
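
For orientation, the standard Tofts model and its derivative form can be written out as below. This shows why a derivative-based formulation leads to a linear problem; it is a textbook sketch and not necessarily the exact formulation used in the thesis. The symbols are assumptions introduced here: C_t is the tissue contrast agent concentration, C_p the plasma concentration, K^trans the transfer constant, and k_ep the rate constant.

```latex
% Standard Tofts model in integral form
\begin{equation}
  C_t(t) = K^{\mathrm{trans}} \int_0^t C_p(\tau)\, e^{-k_{ep}(t-\tau)}\, d\tau
\end{equation}
% Differentiating with respect to t gives an expression that is
% linear in the two unknown parameters
\begin{equation}
  \frac{dC_t}{dt}(t) = K^{\mathrm{trans}}\, C_p(t) - k_{ep}\, C_t(t)
\end{equation}
% Sampled at times t_1, ..., t_N, this becomes a linear system
\begin{equation}
  \begin{bmatrix}
    C_p(t_1) & -C_t(t_1) \\
    \vdots   & \vdots    \\
    C_p(t_N) & -C_t(t_N)
  \end{bmatrix}
  \begin{bmatrix} K^{\mathrm{trans}} \\ k_{ep} \end{bmatrix}
  =
  \begin{bmatrix}
    \frac{dC_t}{dt}(t_1) \\ \vdots \\ \frac{dC_t}{dt}(t_N)
  \end{bmatrix}
\end{equation}
```

Because numerical differentiation amplifies noise, the left-hand derivative estimates are sensitive to measurement error, which is consistent with the abstract's use of a smoothing filter before solving the system.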

II. Development of DCE-MRI analysis methods for therapeutic response assessment. This part pursues methodological developments along two approaches. The first is to develop a model-free analysis method for evaluating DCE-MRI functional heterogeneity. This approach is motivated by the rationale that radiotherapy-induced functional change could be heterogeneous across the treatment area. The first effort was a translational investigation of classic fractal dimension theory for DCE-MRI therapeutic response assessment. In a small-animal anti-angiogenesis drug therapy experiment, the randomly assigned treatment and control groups received multiple-fraction treatments with one pre-treatment and multiple post-treatment DCE-MRI scans at high spatiotemporal resolution. In the post-treatment scan two weeks after the start, the investigated Rényi dimensions of the classic PK rate constant map showed significant differences between the treatment and control groups; when Rényi dimensions were used for treatment/control group classification, the accuracy achieved was higher than that obtained with conventional PK parameter statistics. Following this pilot work, two novel texture analysis methods were proposed. First, a new technique called Gray Level Local Power Matrix (GLLPM) was developed. It is intended to address the lack of temporal information and the poor calculation efficiency of the commonly used Gray Level Co-Occurrence Matrix (GLCOM) techniques. In the same small-animal experiment, the dynamic curves of Haralick texture features derived from the GLLPM performed better overall than the corresponding curves derived from current GLCOM techniques in treatment/control separation and classification. The second method developed is dynamic Fractal Signature Dissimilarity (FSD) analysis. Inspired by classic fractal dimension theory, this method quantitatively measures the dynamics of tumor heterogeneity during contrast agent uptake on DCE images. In the small-animal experiment mentioned above, the selected parameters from dynamic FSD analysis showed significant differences between the treatment and control groups as early as after 1 treatment fraction; in contrast, metrics from conventional PK analysis showed significant differences only after 3 treatment fractions. With dynamic FSD parameters, treatment/control group classification after the 1st treatment fraction was better than with conventional PK statistics. These results suggest that this novel method is promising for capturing early therapeutic response.
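
For readers unfamiliar with the baseline that GLLPM is compared against, the sketch below computes a conventional gray level co-occurrence matrix and one Haralick feature (contrast) in plain Python. It illustrates only the classic technique, not GLLPM or FSD. The non-symmetric matrix, the single horizontal offset, and the toy images are assumptions made for this example.

```python
def glcm(image, levels, dr=0, dc=1):
    """Count co-occurrences of gray levels (i, j) for pixel pairs
    separated by the offset (dr, dc), then normalize to probabilities."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]

def contrast(p):
    """Haralick contrast: sum over (i, j) of (i - j)^2 * p[i][j]."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p)
               for j, v in enumerate(row))

# Two toy images with 4 gray levels (0..3): uniform vs. alternating.
flat = [[1, 1, 1], [1, 1, 1]]
checker = [[0, 3, 0], [3, 0, 3]]
```

The uniform image has zero contrast, while the alternating image, where every horizontal neighbor pair differs by 3 gray levels, has contrast 9. A single co-occurrence matrix describes one static image, which is the limitation regarding temporal information that the abstract attributes to this family of techniques.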

The second approach to developing novel DCE-MRI methods is to combine PK information from multiple PK models. Currently, the classic Tofts model or its alternative version is widely adopted for DCE-MRI analysis as a gold-standard approach for therapeutic response assessment. Previously, a shutter-speed (SS) model was proposed to incorporate the transcytolemmal water exchange effect into contrast agent concentration quantification. Despite its richer biological assumptions, its application in therapeutic response assessment is limited. Combining information from the SS model and the classic Tofts model might reveal new biological information for treatment assessment. The feasibility of this idea was investigated in the same small-animal experiment. The SS model was compared against the Tofts model for therapeutic response assessment using a comparison of regional mean PK parameter values. Based on the modeled transcytolemmal water exchange rate, a biological subvolume was proposed and automatically identified using histogram analysis. Within the biological subvolume, the PK rate constant derived from the SS model was shown to be superior to the one from the Tofts model in treatment/control separation and classification. Furthermore, novel biomarkers were designed to integrate the PK rate constants from these two models. When evaluated in the biological subvolume, these biomarkers reflected significant treatment/control differences in both post-treatment evaluations. These results confirm the potential value of the SS model, as well as of its combination with the Tofts model, for therapeutic response assessment.

In summary, this study addressed two problems in the application of DCE-MRI to radiotherapy assessment. In the first part, a method of accelerating DCE-MRI acquisition for better temporal resolution was investigated, and a novel PK model fitting algorithm was proposed for high temporal resolution DCE-MRI. In the second part, two model-free texture analysis methods and a multiple-model analysis method were developed for DCE-MRI therapeutic response assessment. The work presented could benefit the future routine clinical application of DCE-MRI in radiotherapy assessment.

Relevance:

80.00%

Publisher:

Abstract:

The Dirichlet distribution is a multivariate generalization of the Beta distribution. It is an important multivariate continuous distribution in probability and statistics. In this report, we review the Dirichlet distribution and study its properties, including statistical and information-theoretic quantities involving this distribution. Relationships between the Dirichlet distribution and other distributions are also discussed. There are several ways to generate random variables with a Dirichlet distribution; the stick-breaking approach and the Pólya urn method are discussed. In Bayesian statistics, the Dirichlet distribution and the generalized Dirichlet distribution can both serve as conjugate priors for the Multinomial distribution. The Dirichlet distribution has many applications in different fields. We focus on the unsupervised learning of a finite mixture model based on the Dirichlet distribution. The Initialization Algorithm and the Dirichlet Mixture Estimation Algorithm are both reviewed for estimating the parameters of a Dirichlet mixture. Three experimental results are shown: the estimation of artificial histograms, the summarization of image databases, and human skin detection.
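
The stick-breaking construction mentioned above can be sketched with the Python standard library alone. This is a minimal illustration for a finite Dirichlet distribution; the function name and the example parameters are assumptions, and the report's own algorithms are not reproduced here.

```python
import random

def dirichlet_stick_breaking(alpha, rng=random):
    """Draw one sample from Dirichlet(alpha) by breaking a unit stick:
    piece i takes a Beta(alpha[i], sum(alpha[i+1:])) fraction of
    whatever length is still left."""
    remaining = 1.0
    sample = []
    for i in range(len(alpha) - 1):
        frac = rng.betavariate(alpha[i], sum(alpha[i + 1:]))
        sample.append(remaining * frac)
        remaining *= 1.0 - frac
    sample.append(remaining)  # the last piece is whatever is left
    return sample

random.seed(1)
x = dirichlet_stick_breaking([2.0, 3.0, 5.0])
```

Each draw is a vector of non-negative components that sum to 1, and over many draws the mean of component i approaches alpha[i] / sum(alpha), for example 0.2 for the first component with the parameters above.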

Relevance:

80.00%

Publisher:

Abstract:

Objective
Pedestrian detection under video surveillance systems has always been a hot topic in computer vision research. These systems are widely used in train stations, airports, large commercial plazas, and other public places. However, pedestrian detection remains difficult because of complex backgrounds. Given its development in recent years, the visual attention mechanism has attracted increasing attention in object detection and tracking research, and previous studies have achieved substantial progress and breakthroughs. We propose a novel pedestrian detection method based on the semantic features under the visual attention mechanism.
Method
The proposed semantic feature-based visual attention model is a spatial-temporal model that consists of two parts: the static visual attention model and the motion visual attention model. The static visual attention model in the spatial domain is constructed by combining bottom-up with top-down attention guidance. Based on the characteristics of pedestrians, the bottom-up visual attention model of Itti is improved by intensifying the orientation vectors of elementary visual features to make the visual saliency map suitable for pedestrian detection. In terms of pedestrian attributes, skin color is selected as a semantic feature for pedestrian detection. The regional and Gaussian models are adopted to construct the skin color model. Skin feature-based visual attention guidance is then proposed to complete the top-down process. The bottom-up and top-down visual attentions are linearly combined using the proper weights obtained from experiments to construct the static visual attention model in the spatial domain. The spatial-temporal visual attention model is then constructed via the motion features in the temporal domain. Based on the static visual attention model in the spatial domain, the frame difference method is combined with optical flow to detect motion vectors. Filtering is applied to process the field of motion vectors. The saliency of motion vectors can be evaluated via motion entropy to make the selected motion feature more suitable for the spatial-temporal visual attention model.
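
Two of the simpler building blocks described above, frame differencing and the linear combination of bottom-up and top-down saliency maps, can be sketched as follows. This is an illustrative Python sketch, not the authors' MATLAB implementation: the function names, the threshold, and the convention that the two weights sum to 1 are assumptions, and the optical flow, skin color model, and motion entropy steps are not shown.

```python
def frame_difference(prev, curr, threshold):
    """Binary motion mask: 1 where the absolute intensity change
    between two frames exceeds `threshold`, else 0."""
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]

def combine_saliency(bottom_up, top_down, w_bottom_up):
    """Pixel-wise linear combination w * bottom_up + (1 - w) * top_down."""
    w = w_bottom_up
    return [[w * b + (1.0 - w) * t for b, t in zip(b_row, t_row)]
            for b_row, t_row in zip(bottom_up, top_down)]

# Toy 2x2 frames: only the top-right pixel changes by more than 20.
motion = frame_difference([[10, 10], [10, 10]], [[10, 60], [12, 10]], 20)
static = combine_saliency([[1.0, 0.0]], [[0.0, 1.0]], 0.5)
```

In the paper the weights are obtained from experiments; the value 0.5 here is only a placeholder.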
Result
Standard datasets and practical videos are selected for the experiments. The experiments are performed on a MATLAB R2012a platform. The experimental results show that our spatial-temporal visual attention model demonstrates favorable robustness under various scenes, including indoor train station surveillance videos and outdoor scenes with swaying leaves. Our proposed model outperforms the visual attention model of Itti, the graph-based visual saliency model, the phase spectrum of quaternion Fourier transform model, and the motion channel model of Liu in terms of pedestrian detection. The proposed model achieves a 93% accuracy rate on the test video.
Conclusion
This paper proposes a novel pedestrian detection method based on the visual attention mechanism. A spatial-temporal visual attention model that uses low-level and semantic features is proposed to calculate the saliency map. Based on this model, pedestrian targets can be detected through shifts in the focus of attention. The experimental results verify the effectiveness of the proposed attention model for detecting pedestrians.

Relevance:

80.00%

Publisher:

Abstract:

One of the most important civic phenomena emerging from favelas in Rio de Janeiro today is “community (photo)journalism”, which is practised by favela residents who are trained in journalistic and artistic techniques to raise critical awareness and promote political mobilisation in- and outside favelas. This paper looks at some of the work produced at one training place for community photographers, the agency-school Imagens do Povo (“Images of the People”) in Nova Holanda, a favela located in Rio’s North Zone. Using an ethnographic approach, this article first provides an account of the working practices of the School and its photographers. This is followed by a discussion of a small sample of their photographic work, for which we employ a social semiotic paradigm of image analysis. This methodological synergy provides insights into how these journalists document long-term structural as well as “spectacular” violence in favelas, while at the same time striving to capture some of the “beauty” of these communities. The paper concludes that this form of photographic work constitutes an important step towards a more analytical brand of journalism with different news values that encourage a more context-sensitive approach to covering urban violence and favela life.

KEYWORDS: alternative media, Imagens do Povo, multimodality, news values, photojournalism.

Relevance:

80.00%

Publisher:

Abstract:

Sharpening is a powerful image transformation because sharp edges can bring out image details. Sharpness is achieved by increasing local contrast and reducing edge widths. We present a method that enhances sharpness of images and thereby their perceptual quality. Most existing enhancement techniques require user input to improve the perception of the scene in a manner most pleasing to the particular user. Our goal of image enhancement is to improve the perception of sharpness in digital images for human viewers. We consider two parameters in order to exaggerate the differences between local intensities. The two parameters exploit local contrast and widths of edges. We start from the assumption that color, texture, or objects of focus such as faces affect the human perception of photographs. When human raters are presented with a collection of images with different sharpness and asked to rank them according to perceived sharpness, the results have shown that there is a statistical consensus among the raters. We introduce a ramp enhancement technique by modifying the optimal overshoot in the ramp for different region contrasts as well as the new ramp width. Optimal parameter values are searched to be applied to regions under the criteria mentioned above. In this way, we aim to enhance digital images automatically to create pleasing image output for common users.
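
To illustrate how sharpening raises local contrast and produces overshoot around an edge, the sketch below applies classic unsharp masking to a one-dimensional intensity ramp. Unsharp masking is a substitute chosen here for brevity; it is not the ramp enhancement technique of the paper, and the 3-tap box blur, the `amount` value, and the sample profile are assumptions.

```python
def unsharp_mask_1d(signal, amount):
    """Sharpen a 1-D intensity profile by adding `amount` times the
    difference between the signal and a 3-tap box blur of it.
    Edge samples are replicated at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        blurred = (left + signal[i] + right) / 3.0
        out.append(signal[i] + amount * (signal[i] - blurred))
    return out

# A dark-to-bright ramp edge (made-up intensities).
ramp = [0, 0, 0, 50, 100, 100, 100]
sharp = unsharp_mask_1d(ramp, amount=1.5)
```

The output dips below 0 just before the edge and rises above 100 just after it, which is the overshoot that makes the edge look crisper. A real implementation would clip the result to the valid intensity range, and the amount of overshoot is exactly the kind of parameter the paper proposes to tune per region.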

Relevance:

80.00%

Publisher:

Abstract:

The use of digital image processing techniques is prominent in medical settings for the automatic diagnosis of diseases. Glaucoma is the second leading cause of blindness in the world and it has no cure. Currently, there are treatments to prevent vision loss, but the disease must be detected in the early stages. Thus, the objective of this work is to develop a method for the automatic detection of glaucoma in retinal images. The methodology comprised the acquisition of an image database, optic disc segmentation, texture feature extraction in different color models, and classification of images as glaucomatous or not. We obtained an accuracy of 93%.

Relevance:

80.00%

Publisher:

Abstract:

Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. 
With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVM. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This proposed training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck to process large scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure image similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach to use comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category dependent similarity regularization. Experiments on hundreds of categories show that our method can make significant improvement for categories with few or even no positive examples.
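
The kernel underlying an intersection kernel machine is simple to state. The Python sketch below shows the histogram intersection kernel and a generic kernel decision function. It is not the SIKMA training algorithm, which the abstract does not detail; the function names, the support vectors, and the coefficient values are hypothetical.

```python
def intersection_kernel(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))

def decision(x, support, alphas, bias=0.0):
    """Generic kernel classifier score: sum_i alpha_i * K(s_i, x) + bias."""
    return sum(a * intersection_kernel(s, x)
               for s, a in zip(support, alphas)) + bias

# Hypothetical normalized histograms and signed coefficients.
support = [[0.2, 0.5, 0.3], [0.7, 0.2, 0.1]]
alphas = [1.0, -1.0]
score = decision([0.1, 0.6, 0.3], support, alphas)
```

For normalized histograms the kernel value lies between 0 and 1 and equals 1 only when the two histograms coincide, so a positive `score` here means the query overlaps more with the positively weighted example than with the negatively weighted one.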

Relevance:

80.00%

Publisher:

Abstract:

Final Master's project submitted to obtain the degree of Master in Communication Networks and Multimedia Engineering