899 resultados para Information Filtering, Pattern Mining, Relevance Feature Discovery, Text Mining


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Short bibliography at end of each chapter.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Shipping list no.: 2002-0451-M.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

description accompanying photograph: picture shows Quinnesec Falls and hydraulic power works

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Mode of access: Internet.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Kept up to date by supplements through 1969.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Kalman inverse filtering is used to develop a methodology for real-time estimation of forces acting at the interface between tyre and road on large off-highway mining trucks. The system model formulated is capable of estimating the three components of tyre-force at each wheel of the truck using a practical set of measurements and inputs. Good tracking is obtained by the estimated tyre-forces when compared with those simulated by an ADAMS virtual-truck model. A sensitivity analysis determines the susceptibility of the tyre-force estimates to uncertainties in the truck's parameters.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper we explore the use of text-mining methods for the identification of the author of a text. We apply the support vector machine (SVM) to this problem, as it is able to cope with half a million of inputs it requires no feature selection and can process the frequency vector of all words of a text. We performed a number of experiments with texts from a German newspaper. With nearly perfect reliability the SVM was able to reject other authors and detected the target author in 60–80% of the cases. In a second experiment, we ignored nouns, verbs and adjectives and replaced them by grammatical tags and bigrams. This resulted in slightly reduced performance. Author detection with SVMs on full word forms was remarkably robust even if the author wrote about different topics.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Document ranking is an important process in information retrieval (IR). It presents retrieved documents in an order of their estimated degrees of relevance to query. Traditional document ranking methods are mostly based on the similarity computations between documents and query. In this paper we argue that the similarity-based document ranking is insufficient in some cases. There are two reasons. Firstly it is about the increased information variety. There are far too many different types documents available now for user to search. The second is about the users variety. In many cases user may want to retrieve documents that are not only similar but also general or broad regarding a certain topic. This is particularly the case in some domains such as bio-medical IR. In this paper we propose a novel approach to re-rank the retrieved documents by incorporating the similarity with their generality. By an ontology-based analysis on the semantic cohesion of text, document generality can be quantified. The retrieved documents are then re-ranked by their combined scores of similarity and the closeness of documents’ generality to the query’s. Our experiments have shown an encouraging performance on a large bio-medical document collection, OHSUMED, containing 348,566 medical journal references and 101 test queries.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index imagersquos multi-features into a single one-dimensional structure. Both for text and visual feature spaces, the similarity between a point and a local partitionrsquos center in individual space is used as the indexing key, where similarity values in different features are distinguished by different scale. Then a single indexing tree can be built on these keys. Based on the property that relevant images haves similar similarity values from the center of the same local partition in any feature space, certain number of irrelevant images can be fast pruned based on the triangle inequity on indexing keys. To remove the ldquodimensionality curserdquo existing in high dimensional structure, we propose a new technique called Local Bit Stream (LBS). LBS transforms imagersquos text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on local partitionrsquos center. Such BS representations are small in size and fast for comparison since only bit operation are involved. By comparing common bits existing in two BSs, most of irrelevant images can be immediately filtered. Our extensive experiment showed that single one-dimensional index on multi-features improves multi-indices on multi-features greatly. Our LBS method outperforms sequential scan on high dimensional space by an order of magnitude.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

There has been an increased demand for characterizing user access patterns using web mining techniques since the informative knowledge extracted from web server log files can not only offer benefits for web site structure improvement but also for better understanding of user navigational behavior. In this paper, we present a web usage mining method, which utilize web user usage and page linkage information to capture user access pattern based on Probabilistic Latent Semantic Analysis (PLSA) model. A specific probabilistic model analysis algorithm, EM algorithm, is applied to the integrated usage data to infer the latent semantic factors as well as generate user session clusters for revealing user access patterns. Experiments have been conducted on real world data set to validate the effectiveness of the proposed approach. The results have shown that the presented method is capable of characterizing the latent semantic factors and generating user profile in terms of weighted page vectors, which may reflect the common access interest exhibited by users among same session cluster.