992 results for similarity search


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Many traffic situations require drivers to cross or merge into a stream having higher priority. Gap acceptance theory enables us to model such processes to analyse traffic operation. This discussion demonstrated that numerical search fine-tuned by statistical analysis can be used to determine the most likely critical gap for a sample of drivers, based on their largest rejected gap and accepted gap. This method shares some common features with the Maximum Likelihood Estimation technique (Troutbeck 1992) but lends itself well to contemporary analysis tools such as spreadsheets and is particularly analytically transparent. This method is considered not to bias the estimate of the critical gap due to very small or very large rejected gaps. However, it requires a sample large enough to give reasonable representation of largest rejected gap/accepted gap pairs within a fairly narrow highest-likelihood search band.
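The abstract does not spell out the search procedure, so the following is only a rough sketch of the general idea of a numerical search over candidate critical gaps. It assumes a simple consistency criterion (a driver is consistent with a candidate value if it lies above their largest rejected gap and at or below their accepted gap), which is an illustrative stand-in rather than the likelihood measure used in the paper. The function names and the sample data are hypothetical.

```python
def consistency_count(candidate, pairs):
    """Count drivers whose (largest rejected gap, accepted gap) pair brackets the candidate."""
    return sum(1 for rejected, accepted in pairs if rejected < candidate <= accepted)


def search_critical_gap(pairs, candidates):
    """Return the first candidate with the highest consistency count, plus that count.

    This is an assumed stand-in criterion, not the statistic from the paper.
    """
    best = max(candidates, key=lambda c: consistency_count(c, pairs))
    return best, consistency_count(best, pairs)


# Hypothetical sample: (largest rejected gap, accepted gap) in seconds, one pair per driver.
pairs = [(2.0, 5.0), (3.0, 4.5), (3.5, 6.0), (1.5, 3.8), (4.0, 5.5)]
candidates = [i / 10 for i in range(10, 81)]  # 1.0 s to 8.0 s in 0.1 s steps
best_gap, best_count = search_critical_gap(pairs, candidates)
```

With a small sample, several candidates tie for the best count, which mirrors the abstract's remark that a sufficiently large sample is needed within the search band.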

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We consider the problem of choosing, sequentially, a map which assigns elements of a set A to a few elements of a set B. On each round, the algorithm suffers some cost associated with the chosen assignment, and the goal is to minimize the cumulative loss of these choices relative to the best map on the entire sequence. Even though the offline problem of finding the best map is provably hard, we show that there is an equivalent online approximation algorithm, Randomized Map Prediction (RMP), that is efficient and performs nearly as well. While drawing upon results from the "Online Prediction with Expert Advice" setting, we show how RMP can be utilized as an online approach to several standard batch problems. We apply RMP to online clustering as well as online feature selection and, surprisingly, RMP often outperforms the standard batch algorithms on these problems.
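For readers unfamiliar with the "Online Prediction with Expert Advice" setting mentioned above, here is a minimal sketch of the classical exponential-weights (Hedge) forecaster. It is not RMP itself, whose details the abstract does not give; the function name `hedge`, the learning rate `eta`, and the toy loss sequence are assumptions made for illustration.

```python
import math


def hedge(loss_rounds, eta=0.5):
    """Exponential-weights forecaster over a fixed set of experts.

    loss_rounds: list of per-round loss lists, one loss in [0, 1] per expert.
    Returns (expected cumulative loss of the learner, cumulative loss per expert).
    """
    n_experts = len(loss_rounds[0])
    weights = [1.0] * n_experts
    learner_loss = 0.0
    expert_loss = [0.0] * n_experts
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]  # distribution played this round
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        for i, loss in enumerate(losses):
            expert_loss[i] += loss
            weights[i] *= math.exp(-eta * loss)  # down-weight experts that did badly
    return learner_loss, expert_loss


# Toy sequence: expert 0 is always right, experts 1 and 2 are always wrong.
rounds = [[0.0, 1.0, 1.0]] * 20
learner_loss, expert_loss = hedge(rounds, eta=0.5)
```

The learner's cumulative loss stays within the standard regret bound of ln(N)/eta + eta*T/8 of the best expert, which is the sense in which such online algorithms perform "nearly as well" as the best fixed choice in hindsight.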

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In the present paper, we introduce BioPatML.NET, an application library for the Microsoft Windows .NET framework [2] that implements the BioPatML pattern definition language and sequence search engine. BioPatML.NET is integrated with the Microsoft Biology Foundation (MBF) application library [3], unifying the parsers and annotation services supported or emerging through MBF with the language, search framework and pattern repository of BioPatML. End users who wish to exploit the BioPatML.NET engine and repository without engaging the services of a programmer may do so via the freely accessible web-based BioPatML Editor, which we describe below.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper proposes an innovative instance-similarity-based evaluation metric that reduces the search map for clustering to be performed. An aggregate global score is calculated for each instance using the novel idea of the Fibonacci series. The use of Fibonacci numbers separates the instances effectively and, hence, the intra-cluster similarity is increased and the inter-cluster similarity is decreased during clustering. The proposed FIBCLUS algorithm is able to handle datasets with numerical attributes, categorical attributes, and a mix of both. Results obtained with FIBCLUS are compared with the results of existing algorithms such as k-means, x-means, expectation maximization and hierarchical algorithms that are widely used to cluster numeric, categorical and mixed data types. Empirical analysis shows that FIBCLUS produces better clustering solutions in terms of entropy, purity and F-score than the existing algorithms described above.
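Two of the evaluation measures named above, purity and entropy, can be sketched as follows. This uses one common definition of each (majority-class fraction for purity, size-weighted per-cluster class entropy in bits for entropy); the abstract does not state which variants the authors used, and the function names and sample data are illustrative assumptions.

```python
import math
from collections import Counter


def _group_labels(clusters, labels):
    """Collect the true labels that fall into each cluster id."""
    groups = {}
    for cluster_id, label in zip(clusters, labels):
        groups.setdefault(cluster_id, []).append(label)
    return groups


def purity(clusters, labels):
    """Fraction of instances that belong to the majority class of their cluster."""
    groups = _group_labels(clusters, labels)
    majority = sum(Counter(ys).most_common(1)[0][1] for ys in groups.values())
    return majority / len(labels)


def entropy(clusters, labels):
    """Size-weighted average of the class entropy (in bits) within each cluster."""
    groups = _group_labels(clusters, labels)
    total = 0.0
    for ys in groups.values():
        n = len(ys)
        h = -sum((k / n) * math.log2(k / n) for k in Counter(ys).values())
        total += (n / len(labels)) * h
    return total


# Hypothetical clustering of six instances with known class labels.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["a", "a", "b", "b", "b", "b"]
p = purity(clusters, labels)
e = entropy(clusters, labels)
```

Higher purity and lower entropy both indicate clusters that align more closely with the known classes.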

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Most recommendation methods employ item-item similarity measures or use ratings data to generate recommendations. These methods use traditional two-dimensional models to find interrelationships between similar users and products. This paper proposes a novel recommendation method that uses a multi-dimensional model, the tensor, to group similar users based on common search behaviour and then finds associations within such groups to make effective inter-group recommendations. Web log data is multi-dimensional data. Unlike vector-based methods, tensors are able to correlate such similar instances, consisting of users and searches, and to find latent relationships between them. Non-redundant rules from such user-search associations are then used to make recommendations to the users.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Data processing for information extraction is increasingly important for Web databases. Due to the sheer size and volume of databases, retrieving the relevant information that users need has become a cumbersome process. Information seekers face information overload: too many results are returned for their queries. Conversely, too few or no results are returned if a specific query is asked. This paper proposes a ranking algorithm that gives higher preference to a user’s current search and also utilizes profile information in order to obtain the relevant results for a user’s query.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We propose to use Tensor Space Modeling (TSM) to represent and analyze users’ web log data, which consists of multiple interests and spans multiple dimensions. Further, we propose to use the decomposition factors of the tensors to cluster users based on the similarity of their search behaviour. Preliminary results show that the proposed method outperforms traditional Vector Space Model (VSM) based clustering.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper discusses human factors issues of low-cost railway level crossings in Australia. The issues discussed include safety at passive railway level crossings, human factors considerations associated with unavailability of a warning device, and a conceptual model for how safety could be compromised at railway level crossings following prolonged or frequent unavailability. The research plans to quantify safety risk to motorists at level crossings using a Human Reliability Assessment (HRA) method, supported by data collected using an advanced driving simulator. This method aims to identify human error within tasks and task units identified as part of the task analysis process. It is anticipated that by modelling driver behaviour the current study will be able to quantify meaningful task variability, including temporal parameters, between participants and within participants. The process of complex tasks such as driving through a level crossing is fundamentally context-bound. Therefore, this study also aims to quantify those performance-shaping factors that contribute to vehicle-train collisions by highlighting changes in the task units and driver physiology. Finally, we will also consider a number of variables germane to ensuring external validity of our results. Without this inclusion, such an analysis could seriously underestimate the probabilistic risk assessment.