10 results for Latent semantic indexing
at Duke University
Abstract:
This article examines the behavior of equity trading volume and volatility for the individual firms composing the Standard & Poor's 100 composite index. Using multivariate spectral methods, we find that fractionally integrated processes best describe the long-run temporal dependencies in both series. Consistent with a stylized mixture-of-distributions hypothesis model in which the aggregate "news"-arrival process possesses long-memory characteristics, the long-run hyperbolic decay rates appear to be common across each volume-volatility pair.
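As a rough illustration of the "hyperbolic decay" mentioned in this abstract, the sketch below computes the moving-average weights of a fractionally integrated process (1 - L)^(-d) using the standard recursion psi_k = psi_(k-1) * (k - 1 + d) / k. The function name and the value d = 0.4 are assumptions chosen for illustration; they are not taken from the article, and this is not the authors' spectral estimation method.

```python
def fractional_ma_weights(d, num_lags):
    """Moving-average weights of (1 - L)^(-d).

    Uses the recursion psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k.
    """
    weights = [1.0]
    for k in range(1, num_lags + 1):
        weights.append(weights[-1] * (k - 1 + d) / k)
    return weights


d = 0.4  # hypothetical memory parameter (0 < d < 0.5), not from the article
psi = fractional_ma_weights(d, 2000)

# Hyperbolic decay: psi_k behaves like k^(d - 1) for large k, so doubling
# the lag scales the weight by roughly 2^(d - 1) rather than a fixed
# geometric factor per lag.
ratio = psi[2000] / psi[1000]
```

For large lags, `ratio` should be close to `2 ** (d - 1)`, whereas a short-memory (geometrically decaying) process would give a ratio near zero over the same span.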
Abstract:
We develop a model for stochastic processes with random marginal distributions. Our model relies on a stick-breaking construction for the marginal distribution of the process, and introduces dependence across locations by using a latent Gaussian copula model as the mechanism for selecting the atoms. The resulting latent stick-breaking process (LaSBP) induces a random partition of the index space, with points closer in space having a higher probability of being in the same cluster. We develop an efficient and straightforward Markov chain Monte Carlo (MCMC) algorithm for computation and discuss applications in financial econometrics and ecology. This article has supplementary material online.
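The stick-breaking construction and the idea of selecting atoms through a latent Gaussian variable can be sketched as follows. This is a minimal illustration under stated assumptions, not the LaSBP algorithm itself: the truncation level, the Beta(1, alpha) break distribution, and the helper names `stick_breaking_weights` and `select_atom` are all hypothetical, and the spatial dependence of the latent values is left out.

```python
import math
import random


def stick_breaking_weights(alpha, num_atoms, rng):
    """Truncated stick-breaking: w_k = v_k * prod_{j<k} (1 - v_j)."""
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for k in range(num_atoms):
        # The final break takes the whole remainder so the weights sum to 1.
        v = 1.0 if k == num_atoms - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


def select_atom(z, weights):
    """Map a standard-normal latent value to an atom index via its CDF."""
    u = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # Phi(z), in [0, 1]
    cumulative = 0.0
    for k, w in enumerate(weights):
        cumulative += w
        if u <= cumulative:
            return k
    return len(weights) - 1  # guard against floating-point round-off


rng = random.Random(0)  # fixed seed, for reproducibility only
weights = stick_breaking_weights(alpha=2.0, num_atoms=10, rng=rng)
indices = [select_atom(z, weights) for z in (-1.0, -0.9, 0.0, 1.0)]
```

Because nearby locations would have highly correlated latent values in a model of this kind, their values of `z` are close and they tend to map to the same atom, which is the clustering behaviour the abstract describes.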
Abstract:
Tumor microenvironmental stresses, such as hypoxia and lactic acidosis, play important roles in tumor progression. Although gene signatures reflecting the influence of these stresses are powerful approaches to link expression with phenotypes, they do not fully reflect the complexity of human cancers. Here, we describe the use of latent factor models to further dissect the stress gene signatures in a breast cancer expression dataset. The genes in these latent factors are coordinately expressed in tumors and depict distinct, interacting components of the biological processes. The genes in several latent factors are highly enriched in chromosomal locations. When these factors are analyzed in independent datasets with gene expression and array CGH data, the expression values of these factors are highly correlated with copy number alterations (CNAs) of the corresponding BAC clones in both the cell lines and tumors. Therefore, variation in the expression of these pathway-associated factors is at least partially caused by variation in gene dosage and CNAs among breast cancers. We have also found the expression of two latent factors without any chromosomal enrichment is highly associated with 12q CNA, likely an instance of "trans"-variations in which CNA leads to the variations in gene expression outside of the CNA region. In addition, we have found that factor 26 (1q CNA) is negatively correlated with HIF-1alpha protein and hypoxia pathways in breast tumors and cell lines. This agrees with, and for the first time links, known good prognosis associated with both a low hypoxia signature and the presence of CNA in this region. Taken together, these results suggest the possibility that tumor segmental aneuploidy makes significant contributions to variation in the lactic acidosis/hypoxia gene signatures in human cancers and demonstrate that latent factor analysis is a powerful means to uncover such a linkage.
Abstract:
We discuss a general approach to dynamic sparsity modeling in multivariate time series analysis. Time-varying parameters are linked to latent processes that are thresholded to induce zero values adaptively, providing natural mechanisms for dynamic variable inclusion/selection. We discuss Bayesian model specification, analysis and prediction in dynamic regressions, time-varying vector autoregressions, and multivariate volatility models using latent thresholding. Application to a topical macroeconomic time series problem illustrates some of the benefits of the approach in terms of statistical and economic interpretations as well as improved predictions. Supplementary materials for this article are available online. © 2013 Copyright Taylor and Francis Group, LLC.
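The thresholding mechanism described in this abstract can be sketched in a few lines. The example assumes the latent process is a stationary AR(1) and that the effective coefficient equals the latent value when its magnitude reaches a threshold and zero otherwise. The AR(1) form, the parameter names, and the numeric values are illustrative assumptions, not the article's specification.

```python
import random


def simulate_latent_threshold(num_steps, mu, phi, sigma, threshold, rng):
    """Simulate a latent AR(1) path and its thresholded counterpart."""
    latent, effective = [], []
    beta = mu  # start the latent process at its mean
    for _ in range(num_steps):
        # Latent AR(1) step around the mean mu.
        beta = mu + phi * (beta - mu) + rng.gauss(0.0, sigma)
        latent.append(beta)
        # Thresholding: small latent values are shrunk exactly to zero.
        effective.append(beta if abs(beta) >= threshold else 0.0)
    return latent, effective


rng = random.Random(1)  # fixed seed, for reproducibility only
latent, effective = simulate_latent_threshold(
    num_steps=200, mu=0.2, phi=0.95, sigma=0.1, threshold=0.3, rng=rng
)
num_excluded = sum(1 for b in effective if b == 0.0)
```

Periods where `effective` is zero correspond to the variable dropping out of the model at that time, which is how thresholding yields dynamic variable inclusion and exclusion.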
Abstract:
Learning multiple tasks across heterogeneous domains is a challenging problem since the feature space may not be the same for different tasks. We assume the data in multiple tasks are generated from a latent common domain via sparse domain transforms and propose a latent probit model (LPM) to jointly learn the domain transforms, and the shared probit classifier in the common domain. To learn meaningful task relatedness and avoid over-fitting in classification, we introduce sparsity in the domain transforms matrices, as well as in the common classifier. We derive theoretical bounds for the estimation error of the classifier in terms of the sparsity of domain transforms. An expectation-maximization algorithm is derived for learning the LPM. The effectiveness of the approach is demonstrated on several real datasets.
Abstract:
Researchers currently debate whether new semantic knowledge can be learned and retrieved despite extensive damage to medial temporal lobe (MTL) structures. The authors explored whether H. M., a patient with amnesia, could acquire new semantic information in the context of his lifelong hobby of solving crossword puzzles. First, H. M. was tested on a series of word-skills tests believed important in solving crosswords. He also completed 3 new crosswords: 1 puzzle testing pre-1953 knowledge, another testing post-1953 knowledge, and another combining the 2 by giving postoperative semantic clues for preoperative answers. From the results, the authors concluded that H. M. can acquire new semantic knowledge, at least temporarily, when he can anchor it to mental representations established preoperatively.
Abstract:
Undergraduates were asked to generate a name for a hypothetical new exemplar of a category. They produced names that had the same numbers of syllables, the same endings, and the same types of word stems as existing exemplars of that category. In addition, novel exemplars, each consisting of a nonsense syllable root and a prototypical ending, were accurately assigned to categories. The data demonstrate the abstraction and use of surface properties of words.
Abstract:
Cognitive neuroscience, as a discipline, links the biological systems studied by neuroscience to the processing constructs studied by psychology. By mapping these relations throughout the literature of cognitive neuroscience, we visualize the semantic structure of the discipline and point to directions for future research that will advance its integrative goal. For this purpose, network text analyses were applied to an exhaustive corpus of abstracts collected from five major journals over a 30-month period, including every study that used fMRI to investigate psychological processes. From this, we generate network maps that illustrate the relationships among psychological and anatomical terms, along with centrality statistics that guide inferences about network structure. Three terms--prefrontal cortex, amygdala, and anterior cingulate cortex--dominate the network structure with their high frequency in the literature and the density of their connections with other neuroanatomical terms. From network statistics, we identify terms that are understudied compared with their importance in the network (e.g., insula and thalamus), are underspecified in the language of the discipline (e.g., terms associated with executive function), or are imperfectly integrated with other concepts (e.g., subdisciplines like decision neuroscience that are disconnected from the main network). Taking these results as the basis for prescriptive recommendations, we conclude that semantic analyses provide useful guidance for cognitive neuroscience as a discipline, both by illustrating systematic biases in the conduct and presentation of research and by identifying directions that may be most productive for future research.
Abstract:
Anecdotal evidence suggests that people sometimes call even very familiar individuals (such as a daughter) by the wrong name. Here, we investigate the conditions surrounding these types of errors, or misnamings, in which a person (the misnamer) incorrectly calls a familiar individual (the misnamed) by someone else's name (the named). Across 5 studies including over 1,700 participants, we investigated the prevalence of the phenomenon of misnaming, identified factors underlying why it may occur, and tested potential mechanisms. We included undergraduates and MTurk workers and asked questions of both the misnamed and the misnamer. We find that familiar individuals are often misnamed with the name of another member of the same semantic category; family members are misnamed with another family member's name and friends are misnamed with another friend's name. Phonetic similarity between names also leads to misnamings; however, the size of this effect was smaller than that of the semantic category effect. Overall, the misnaming of familiar individuals is driven by the relationship between the misnamer, misnamed, and named; phonetic similarity between the incorrect name used by the misnamer and the correct name also plays a role in misnaming.