2 resultados para multi-class classification

em DRUM (Digital Repository at the University of Maryland)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Authentication plays an important role in how we interact with computers, mobile devices, the web, etc. The idea of authentication is to uniquely identify a user before granting access to system privileges. For example, in recent years more corporate information and applications have been accessible via the Internet and Intranet. Many employees are working from remote locations and need access to secure corporate files. During this time, it is possible for malicious or unauthorized users to gain access to the system. For this reason, it is logical to have some mechanism in place to detect whether the logged-in user is the same user in control of the user's session. Therefore, highly secure authentication methods must be used. We posit that each of us is unique in our use of computer systems. It is this uniqueness that is leveraged to "continuously authenticate users" while they use web software. To monitor user behavior, n-gram models are used to capture user interactions with web-based software. This statistical language model essentially captures sequences and sub-sequences of user actions, their orderings, and temporal relationships that make them unique by providing a model of how each user typically behaves. Users are then continuously monitored during software operations. Large deviations from "normal behavior" can possibly indicate malicious or unintended behavior. This approach is implemented in a system called Intruder Detector (ID) that models user actions as embodied in web logs generated in response to a user's actions. User identification through web logs is cost-effective and non-intrusive. We perform experiments on a large fielded system with web logs of approximately 4000 users. For these experiments, we use two classification techniques; binary and multi-class classification. We evaluate model-specific differences of user behavior based on coarse-grain (i.e., role) and fine-grain (i.e., individual) analysis. A specific set of metrics are used to provide valuable insight into how each model performs. Intruder Detector achieves accurate results when identifying legitimate users and user types. This tool is also able to detect outliers in role-based user behavior with optimal performance. In addition to web applications, this continuous monitoring technique can be used with other user-based systems such as mobile devices and the analysis of network traffic.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encodes high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.