1 result for Short-text clustering
in Cochin University of Science
Abstract:
This paper proposes a method for copy detection in short Malayalam text passages. Given two passages, one as the source text and the other as the suspected copy, the method determines whether the second passage is a plagiarized version of the source. An algorithm for plagiarism detection using the n-gram model for word retrieval is developed, and trigrams are found to be the best model for comparing Malayalam text. Based on the probability and resemblance measures calculated from the n-gram comparison, the text is categorized against a threshold. Texts are compared using variable-length n-grams (n = 2, 3, 4). The experiments show that the trigram model gives acceptable average performance at an affordable cost in terms of complexity.
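The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not define its probability or resemblance measures, so the sketch substitutes two common set-based measures as assumptions: Jaccard resemblance and containment over word n-grams. The function names, the whitespace tokenization, and the threshold value of 0.5 are all hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple

NGram = Tuple[str, ...]


def word_ngrams(text: str, n: int) -> Set[NGram]:
    """Return the set of word n-grams in text (whitespace tokenization, an assumption)."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def resemblance(source: str, candidate: str, n: int) -> float:
    """Jaccard resemblance: shared n-grams divided by all distinct n-grams."""
    a, b = word_ngrams(source, n), word_ngrams(candidate, n)
    if not a or not b:
        return 0.0  # a passage shorter than n words has no n-grams
    return len(a & b) / len(a | b)


def containment(source: str, candidate: str, n: int) -> float:
    """Fraction of the candidate's n-grams that also occur in the source."""
    a, b = word_ngrams(source, n), word_ngrams(candidate, n)
    if not b:
        return 0.0
    return len(a & b) / len(b)


def is_copy(source: str, candidate: str, n: int = 3, threshold: float = 0.5) -> bool:
    """Flag the candidate as a copy when containment reaches the threshold.

    The threshold is a hypothetical value chosen for illustration only.
    """
    return containment(source, candidate, n) >= threshold


if __name__ == "__main__":
    # Placeholder tokens stand in for Malayalam words.
    src = "w1 w2 w3 w4 w5 w6"
    cand = "w1 w2 w3 w4 x1 x2"
    for n in (2, 3, 4):
        print(n, resemblance(src, cand, n), containment(src, cand, n))
    print(is_copy(src, cand))
```

Running the loop over n = 2, 3, 4 mirrors the variable-length comparison mentioned in the abstract. Larger n makes matches stricter, because longer word sequences must agree exactly, which is one plausible reason a middle value such as n = 3 might balance sensitivity and cost.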