1 result for Vector Space IR, Search Engines, Document Clustering, Document
in Digital Commons - Michigan Tech
Abstract:
Virtually every sector of business and industry that uses computing, including financial analysis, search engines, and electronic commerce, incorporates Big Data analysis into its business model. Sophisticated clustering algorithms are popular for deducing the nature of data by assigning labels to unlabeled data. We address two main challenges in Big Data. First, by definition, the volume of Big Data is too large to be loaded into a computer’s memory (this volume changes based on the computer used or available, but there is always a data set that is too large for any computer). Second, in real-time applications, the velocity of new incoming data prevents historical data from being stored and future data from being accessed. Therefore, we propose our Streaming Kernel Fuzzy c-Means (stKFCM) algorithm, which significantly reduces both computational complexity and space complexity. The proposed stKFCM requires only O(n²) memory, where n is the (predetermined) size of a data subset (or data chunk) at each time step, which makes this algorithm truly scalable (as n can be chosen based on the available memory). Furthermore, only 2n² elements of the full N × N kernel matrix (where N >> n) need to be calculated at each time step, which reduces both the time spent computing kernel elements and the complexity of the FCM algorithm. Empirical results show that stKFCM, even with relatively small n, can provide clustering performance as accurate as kernel fuzzy c-means run on the entire data set while achieving a significant speedup.
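To make the memory and kernel-element figures concrete, the following Python sketch counts kernel entries for a full N × N matrix versus 2n² entries per time step, and runs a textbook kernel fuzzy c-means on a single chunk's n × n kernel matrix. It is an illustration under stated assumptions, not the authors' stKFCM: the Gaussian kernel, the fuzzifier m = 2, the example sizes, and the function names (`rbf_kernel`, `kfcm_chunk`, `kernel_element_counts`) are all hypothetical choices, and the streaming update that links successive chunks is not described in the abstract and is not reproduced here.

```python
import numpy as np


def kernel_element_counts(N, n):
    """Count kernel entries: full N x N matrix vs. 2*n*n per time step."""
    steps = N // n  # assumes N is a multiple of the chunk size n
    return {
        "full": N * N,
        "per_step": 2 * n * n,
        "streaming_total": steps * 2 * n * n,
    }


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian kernel matrix between the rows of A and the rows of B.

    The kernel choice and gamma are assumptions made for this sketch.
    """
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kfcm_chunk(K, c, m=2.0, iters=50, seed=0):
    """Plain kernel fuzzy c-means on one chunk, given its n x n kernel matrix K.

    Returns an n x c membership matrix whose rows sum to 1. Only K (n*n values)
    and a few n x c arrays are held in memory.
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=n)  # random initial memberships
    diag = np.diag(K)
    for _ in range(iters):
        W = U ** m
        W = W / W.sum(axis=0, keepdims=True)  # per-cluster weights sum to 1
        KW = K @ W
        # Squared feature-space distance from every point to every cluster centre:
        # K_ii - 2 * sum_k w_kj K_ik + sum_kl w_kj w_lj K_kl
        D = diag[:, None] - 2.0 * KW + (W * KW).sum(axis=0)[None, :]
        D = np.maximum(D, 1e-12)  # guard against division by zero
        inv = D ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U


if __name__ == "__main__":
    # Hypothetical sizes: one million points processed in chunks of one thousand.
    print(kernel_element_counts(1_000_000, 1_000))

    # Cluster one small synthetic chunk (made-up data, for illustration only).
    X = np.random.default_rng(1).normal(size=(200, 2))
    U = kfcm_chunk(rbf_kernel(X, X), c=3)
    print(U.shape, U.sum(axis=1)[:3])
```

With these assumed sizes, the full kernel matrix would hold 10¹² entries, while each time step touches 2 × 10⁶, and all 1,000 steps together touch 2 × 10⁹. Because the chunk size n is a free parameter, the per-step footprint can be matched to whatever memory is available.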