2 resultados para Natural language techniques, Semantic spaces, Random projection, Documents
Resumo:
This paper addresses the problem of colorectal tumour segmentation in complex real world imagery. For efficient segmentation, a multi-scale strategy is developed for extracting the potentially cancerous region of interest (ROI) based on colour histograms while searching for the best texture resolution. To achieve better segmentation accuracy, we apply a novel bag-of-visual-words method based on rotation invariant raw statistical features and random projection based l2-norm sparse representation to classify tumour areas in histopathology images. Experimental results on 20 real world digital slides demonstrate that the proposed algorithm results in better recognition accuracy than several state of the art segmentation techniques.
Resumo:
Community-driven Question Answering (CQA) systems that crowdsource experiential information in the form of questions and answers and have accumulated valuable reusable knowledge. Clustering of QA datasets from CQA systems provides a means of organizing the content to ease tasks such as manual curation and tagging. In this paper, we present a clustering method that exploits the two-part question-answer structure in QA datasets to improve clustering quality. Our method, {\it MixKMeans}, composes question and answer space similarities in a way that the space on which the match is higher is allowed to dominate. This construction is motivated by our observation that semantic similarity between question-answer data (QAs) could get localized in either space. We empirically evaluate our method on a variety of real-world labeled datasets. Our results indicate that our method significantly outperforms state-of-the-art clustering methods for the task of clustering question-answer archives.