5 resultados para Functorial Embedding
em DRUM (Digital Repository at the University of Maryland)
Resumo:
While news stories are an important traditional medium to broadcast and consume news, microblogging has recently emerged as a place where people can dis- cuss, disseminate, collect or report information about news. However, the massive information in the microblogosphere makes it hard for readers to keep up with these real-time updates. This is especially a problem when it comes to breaking news, where people are more eager to know “what is happening”. Therefore, this dis- sertation is intended as an exploratory effort to investigate computational methods to augment human effort when monitoring the development of breaking news on a given topic from a microblog stream by extractively summarizing the updates in a timely manner. More specifically, given an interest in a topic, either entered as a query or presented as an initial news report, a microblog temporal summarization system is proposed to filter microblog posts from a stream with three primary concerns: topical relevance, novelty, and salience. Considering the relatively high arrival rate of microblog streams, a cascade framework consisting of three stages is proposed to progressively reduce quantity of posts. For each step in the cascade, this dissertation studies methods that improve over current baselines. In the relevance filtering stage, query and document expansion techniques are applied to mitigate sparsity and vocabulary mismatch issues. The use of word embedding as a basis for filtering is also explored, using unsupervised and supervised modeling to characterize lexical and semantic similarity. In the novelty filtering stage, several statistical ways of characterizing novelty are investigated and ensemble learning techniques are used to integrate results from these diverse techniques. These results are compared with a baseline clustering approach using both standard and delay-discounted measures. In the salience filtering stage, because of the real-time prediction requirement a method of learning verb phrase usage from past relevant news reports is used in conjunction with some standard measures for characterizing writing quality. Following a Cranfield-like evaluation paradigm, this dissertation includes a se- ries of experiments to evaluate the proposed methods for each step, and for the end- to-end system. New microblog novelty and salience judgments are created, building on existing relevance judgments from the TREC Microblog track. The results point to future research directions at the intersection of social media, computational jour- nalism, information retrieval, automatic summarization, and machine learning.
Resumo:
The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve performance by individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model based on a random walk model on multiple graphs to fuse graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low level features and attributes. A low rank attribute embedding is joint learned within the multi-task learning formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered. To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.
Resumo:
Image (Video) retrieval is an interesting problem of retrieving images (videos) similar to the query. Images (Videos) are represented in an input (feature) space and similar images (videos) are obtained by finding nearest neighbors in the input representation space. Numerous input representations both in real valued and binary space have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is a well known problem of retrieving same class images of the query. We address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting in the first part, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class ideally to a unique binary code. We refer to the binary codes of the images as `Semantic Binary Codes' and the unique code for all same class images as `Class Binary Code'. We also propose a new class based Hamming metric that dramatically reduces the retrieval times for larger databases, where only hamming distance is computed to the class binary codes. We also propose a Deep semantic binary code model, by replacing the output layer of a popular convolutional Neural Network (AlexNet) with the class binary codes and show that the hashing functions learned in this way outperforms the state of the art, and at the same time provide fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes. For a given query image, we want to retrieve images that preserve the relative order i.e. we want to retrieve all same class images first and then, the related classes images before different class images. We learn such relationship aware binary codes by minimizing the similarity between inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from the other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related class retrieval results and show significant gains over the state of the art. High Dimensional descriptors like Fisher Vectors or Vector of Locally Aggregated Descriptors have shown to improve the performance of many computer vision applications including retrieval. In the third part, we will discuss an unsupervised technique for compressing high dimensional vectors into high dimensional binary codes, to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm that is intractable for compressing high dimensional vectors. A practical hierarchical model that utilizes divide and conquer techniques using the Random Select and Adjust (RSA) procedure to compress such high dimensional vectors is presented. We show that our proposed high dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods for higher compression ratios. In the last part of the thesis, we propose a retrieval based solution to the Zero shot event classification problem - a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.
Resumo:
Secure Multi-party Computation (MPC) enables a set of parties to collaboratively compute, using cryptographic protocols, a function over their private data in a way that the participants do not see each other's data, they only see the final output. Typical MPC examples include statistical computations over joint private data, private set intersection, and auctions. While these applications are examples of monolithic MPC, richer MPC applications move between "normal" (i.e., per-party local) and "secure" (i.e., joint, multi-party secure) modes repeatedly, resulting overall in mixed-mode computations. For example, we might use MPC to implement the role of the dealer in a game of mental poker -- the game will be divided into rounds of local decision-making (e.g. bidding) and joint interaction (e.g. dealing). Mixed-mode computations are also used to improve performance over monolithic secure computations. Starting with the Fairplay project, several MPC frameworks have been proposed in the last decade to help programmers write MPC applications in a high-level language, while the toolchain manages the low-level details. However, these frameworks are either not expressive enough to allow writing mixed-mode applications or lack formal specification, and reasoning capabilities, thereby diminishing the parties' trust in such tools, and the programs written using them. Furthermore, none of the frameworks provides a verified toolchain to run the MPC programs, leaving the potential of security holes that can compromise the privacy of parties' data. This dissertation presents language-based techniques to make MPC more practical and trustworthy. First, it presents the design and implementation of a new MPC Domain Specific Language, called Wysteria, for writing rich mixed-mode MPC applications. Wysteria provides several benefits over previous languages, including a conceptual single thread of control, generic support for more than two parties, high-level abstractions for secret shares, and a fully formalized type system and operational semantics. Using Wysteria, we have implemented several MPC applications, including, for the first time, a card dealing application. The dissertation next presents Wys*, an embedding of Wysteria in F*, a full-featured verification oriented programming language. Wys* improves on Wysteria along three lines: (a) It enables programmers to formally verify the correctness and security properties of their programs. As far as we know, Wys* is the first language to provide verification capabilities for MPC programs. (b) It provides a partially verified toolchain to run MPC programs, and finally (c) It enables the MPC programs to use, with no extra effort, standard language constructs from the host language F*, thereby making it more usable and scalable. Finally, the dissertation develops static analyses that help optimize monolithic MPC programs into mixed-mode MPC programs, while providing similar privacy guarantees as the monolithic versions.
Resumo:
Social network sites (SNS), such as Facebook, Google+ and Twitter, have attracted hundreds of millions of users daily since their appearance. Within SNS, users connect to each other, express their identity, disseminate information and form cooperation by interacting with their connected peers. The increasing popularity and ubiquity of SNS usage and the invaluable user behaviors and connections give birth to many applications and business models. We look into several important problems within the social network ecosystem. The first one is the SNS advertisement allocation problem. The other two are related to trust mechanisms design in social network setting, including local trust inference and global trust evaluation. In SNS advertising, we study the problem of advertisement allocation from the ad platform's angle, and discuss its differences with the advertising model in the search engine setting. By leveraging the connection between social networks and hyperbolic geometry, we propose to solve the problem via approximation using hyperbolic embedding and convex optimization. A hyperbolic embedding method, \hcm, is designed for the SNS ad allocation problem, and several components are introduced to realize the optimization formulation. We show the advantages of our new approach in solving the problem compared to the baseline integer programming (IP) formulation. In studying the problem of trust mechanisms in social networks, we consider the existence of distrust (i.e. negative trust) relationships, and differentiate between the concept of local trust and global trust in social network setting. In the problem of local trust inference, we propose a 2-D trust model. Based on the model, we develop a semiring-based trust inference framework. In global trust evaluation, we consider a general setting with conflicting opinions, and propose a consensus-based approach to solve the complex problem in signed trust networks.