4 resultados para greedy-rotation-greedy (GRG)
em DRUM (Digital Repository at the University of Maryland)
Resumo:
The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve performance by individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model based on a random walk model on multiple graphs to fuse graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low level features and attributes. A low rank attribute embedding is joint learned within the multi-task learning formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered. To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.
Resumo:
Safe operation of unmanned aerial vehicles (UAVs) over populated areas requires reducing the risk posed by a UAV if it crashed during its operation. We considered several types of UAV risk-based path planning problems and developed techniques for estimating the risk to third parties on the ground. The path planning problem requires making trade-offs between risk and flight time. Four optimization approaches for solving the problem were tested; a network-based approach that used a greedy algorithm to improve the original solution generated the best solutions with the least computational effort. Additionally, an approach for solving a combined design and path planning problems was developed and tested. This approach was extended to solve robust risk-based path planning problem in which uncertainty about wind conditions would affect the risk posed by a UAV.
Resumo:
Object recognition has long been a core problem in computer vision. To improve object spatial support and speed up object localization for object recognition, generating high-quality category-independent object proposals as the input for object recognition system has drawn attention recently. Given an image, we generate a limited number of high-quality and category-independent object proposals in advance and used as inputs for many computer vision tasks. We present an efficient dictionary-based model for image classification task. We further extend the work to a discriminative dictionary learning method for tensor sparse coding. In the first part, a multi-scale greedy-based object proposal generation approach is presented. Based on the multi-scale nature of objects in images, our approach is built on top of a hierarchical segmentation. We first identify the representative and diverse exemplar clusters within each scale. Object proposals are obtained by selecting a subset from the multi-scale segment pool via maximizing a submodular objective function, which consists of a weighted coverage term, a single-scale diversity term and a multi-scale reward term. The weighted coverage term forces the selected set of object proposals to be representative and compact; the single-scale diversity term encourages choosing segments from different exemplar clusters so that they will cover as many object patterns as possible; the multi-scale reward term encourages the selected proposals to be discriminative and selected from multiple layers generated by the hierarchical image segmentation. The experimental results on the Berkeley Segmentation Dataset and PASCAL VOC2012 segmentation dataset demonstrate the accuracy and efficiency of our object proposal model. Additionally, we validate our object proposals in simultaneous segmentation and detection and outperform the state-of-art performance. To classify the object in the image, we design a discriminative, structural low-rank framework for image classification. We use a supervised learning method to construct a discriminative and reconstructive dictionary. By introducing an ideal regularization term, we perform low-rank matrix recovery for contaminated training data from all categories simultaneously without losing structural information. A discriminative low-rank representation for images with respect to the constructed dictionary is obtained. With semantic structure information and strong identification capability, this representation is good for classification tasks even using a simple linear multi-classifier.
Resumo:
In many major cities, fixed route transit systems such as bus and rail serve millions of trips per day. These systems have people collect at common locations (the station or stop), and board at common times (for example according to a predetermined schedule or headway). By using common service locations and times, these modes can consolidate many trips that have similar origins and destinations or overlapping routes. However, the routes are not sensitive to changing travel patterns, and have no way of identifying which trips are going unserved, or are poorly served, by the existing routes. On the opposite end of the spectrum, personal modes of transportation, such as a private vehicle or taxi, offer service to and from the exact origin and destination of a rider, at close to exactly the time they desire to travel. Despite the apparent increased convenience to users, the presence of a large number of small vehicles results in a disorganized, and potentially congested road network during high demand periods. The focus of the research presented in this paper is to develop a system that possesses both the on-demand nature of a personal mode, with the efficiency of shared modes. In this system, users submit their request for travel, but are asked to make small compromises in their origin and destination location by walking to a nearby meeting point, as well as slightly modifying their time of travel, in order to accommodate other passengers. Because the origin and destination location of the request can be adjusted, this is a more general case of the Dial-a-Ride problem with time windows. The solution methodology uses a graph clustering algorithm coupled with a greedy insertion technique. A case study is presented using actual requests for taxi trips in Washington DC, and shows a significant decrease in the number of vehicles required to serve the demand.