1 result for data reduction by factor analysis
at Duke University
Abstract:
Constant advances in technology have caused an explosion of data in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true when analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantitative technology, may be continuous numbers or counts. With the advancement of high-throughput technology, such data have become unprecedentedly abundant. Efficient statistical approaches are therefore crucial in this big data era.
Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, a factor analysis model assumes a latent multivariate Gaussian vector. Under this assumption, a factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, which assumes that the mixture proportions of topics are represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions to these previous statistical methods, developed to address challenges in big data. The novel methods are applied in multiple real-world applications, including construction of condition-specific gene co-expression networks, estimation of shared topics among newsgroups, analysis of promoter sequences, analysis of political-economic risk data, and estimation of population structure from genotype data.
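The low-rank covariance structure of a factor analysis model, as mentioned in the abstract, can be illustrated with a minimal sketch. It assumes the textbook form x = W z + e, with a standard Gaussian latent vector z and independent noise e, so that Cov(x) = W W^T + Psi. The loading matrix, the noise variances, and all function names below are made up for illustration and are not taken from the dissertation.

```python
# Illustrative sketch only: the loadings, noise variances, and helper names
# are hypothetical and do not come from the dissertation.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]

def matrix_rank(a, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in a]
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

# Assumed model: 4 observed variables driven by 2 latent Gaussian factors.
loadings = [[1.0, 0.0],
            [0.5, 1.0],
            [0.0, 2.0],
            [1.5, -1.0]]
noise_var = [0.1, 0.2, 0.3, 0.4]  # diagonal of Psi (independent noise)

# Low-rank part of the covariance: W W^T.
low_rank = matmul(loadings, transpose(loadings))

# Implied covariance of the observed variables: W W^T + Psi.
cov = [[low_rank[i][j] + (noise_var[i] if i == j else 0.0)
        for j in range(4)] for i in range(4)]

print("rank of W W^T:", matrix_rank(low_rank))
print("rank of W W^T + Psi:", matrix_rank(cov))
```

In this sketch the shared structure among the four variables is captured by a rank-2 matrix, while the diagonal noise term only affects the variances. This is one way to read "low-rank estimate of the covariance" under the stated assumptions.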