Abstract:
With Tweet volumes reaching 500 million a day, sampling is inevitable for any application that uses Twitter data. Recognizing this, data providers such as Twitter, Gnip and Boardreader license sampled data streams priced according to sample size. Big Data applications that work with sampled data need a sample large enough to be representative of the universal dataset. Previous work on representativeness has focused on ensuring that the global occurrence rates of key terms can be reliably estimated from the sample. Present technology allows sample sizes to be estimated from probabilistic bounds on occurrence rates for the case of uniform random sampling. In this paper, we consider the problem of further improving sample size estimates by leveraging stratification in Twitter data. We analyze our estimates through an extensive study using simulations and real-world data, establishing the superiority of our method over uniform random sampling. Our work provides the technical know-how for data providers to expand their portfolios to include stratified sampled datasets, while applications benefit by being able to monitor more topics and events at the same data and computing cost.
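The abstract does not give the paper's estimator, so the following is only a minimal sketch of the general idea, using the textbook normal-approximation sample-size formula rather than the authors' method. It compares the sample size needed to estimate a key term's occurrence rate under uniform random sampling with the size needed under stratified sampling with proportional allocation. The function names, the two-stratum split, and all numeric rates are hypothetical, and the population is assumed large enough that no finite-population correction is needed.

```python
import math
from statistics import NormalDist


def _z_score(confidence):
    # Two-sided critical value of the standard normal distribution.
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def uniform_sample_size(rate, margin, confidence=0.95):
    """Sample size for estimating an occurrence rate within +/- margin
    under uniform random sampling (normal approximation)."""
    z = _z_score(confidence)
    variance = rate * (1 - rate)
    return math.ceil(z * z * variance / margin ** 2)


def stratified_sample_size(weights, rates, margin, confidence=0.95):
    """Total sample size under stratified sampling with proportional
    allocation. weights are stratum shares of the population (sum to 1);
    rates are the per-stratum occurrence rates (assumed known or piloted)."""
    z = _z_score(confidence)
    # Only the within-stratum variance contributes to the estimator's error.
    within_variance = sum(w * p * (1 - p) for w, p in zip(weights, rates))
    return math.ceil(z * z * within_variance / margin ** 2)


# Hypothetical example: a term appears in 30% of tweets from a small
# stratum (10% of traffic) and in 1% of tweets from the remainder.
weights = [0.1, 0.9]
rates = [0.30, 0.01]
overall_rate = sum(w * p for w, p in zip(weights, rates))

n_uniform = uniform_sample_size(overall_rate, margin=0.005)
n_stratified = stratified_sample_size(weights, rates, margin=0.005)
print(n_uniform, n_stratified)
```

Because the overall variance equals the within-stratum variance plus the between-stratum variance, the stratified figure in this sketch can never exceed the uniform one, and the gap widens as occurrence rates differ more across strata.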