Abstract:
Information is now a key resource: machine learning and data mining techniques have been developed to extract high-level information from large amounts of data. Because most data comes in the form of unstructured natural-language text, research on text mining is currently very active and addresses practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents into predefined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are trained automatically from pre-labeled documents: they can classify very accurately, but often require a substantial training set and notable computational effort. Methods for cross-domain text categorization have been proposed, which make it possible to leverage a set of labeled documents from one domain to classify those of another. Most of these methods use advanced statistical techniques, usually involving parameter tuning. A first contribution presented here is a method based on nearest centroid classification, in which category profiles are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times. A second, deeper contribution is the design of a domain-independent model that distinguishes the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their respective representative words, which are identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows new categories to be added efficiently during classification.
Results show that classification accuracy still needs improvement, but that models generated from one domain can be effectively reused in a different one.
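The nearest centroid idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not specify the document representation, the similarity measure, or the stopping rule, so the bag-of-words vectors, cosine similarity, fixed iteration cap, and the names `normalize_rows` and `adapt_centroids` are all assumptions made for illustration. The sketch also assumes every category has at least one labeled source document.

```python
import numpy as np


def normalize_rows(m):
    """Scale each row to unit length; all-zero rows are left unchanged."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / np.where(norms == 0, 1.0, norms)


def adapt_centroids(source_x, source_y, target_x, n_classes, max_iter=10):
    """Hypothetical sketch: build category profiles on a labeled source
    domain, then repeatedly re-estimate them from the unlabeled target
    documents currently assigned to each category."""
    source_x = normalize_rows(np.asarray(source_x, dtype=float))
    target_x = normalize_rows(np.asarray(target_x, dtype=float))
    source_y = np.asarray(source_y)

    # Initial profiles: mean vector of the source documents of each category.
    centroids = np.vstack(
        [source_x[source_y == c].mean(axis=0) for c in range(n_classes)]
    )

    labels = None
    for _ in range(max_iter):
        # Cosine similarity between every target document and every profile.
        sims = target_x @ normalize_rows(centroids).T
        new_labels = sims.argmax(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable, stop adapting
        labels = new_labels
        # Adapt each profile using the target documents assigned to it.
        for c in range(n_classes):
            members = target_x[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return labels, centroids


# Toy term-count vectors over a four-word vocabulary (made-up data).
source_x = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
source_y = [0, 0, 1, 1]
target_x = [[3, 0, 1, 0], [1, 0, 0, 0], [0, 1, 3, 0], [0, 0, 1, 1]]

labels, centroids = adapt_centroids(source_x, source_y, target_x, n_classes=2)
print(labels.tolist())
```

In this toy run the profiles start from the source domain and end up as averages of target documents, which is the sense in which they are "adapted" to the unknown domain. A category that receives no target documents in an iteration simply keeps its previous profile.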