994 results for Topic modeling


Relevance:

60.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-07

Relevance:

60.00%

Publisher:

Abstract:

The amount of information on the Internet has exploded in recent decades. As more and more news, blogs, and other kinds of articles are published on the Internet, categorization of articles and documents is increasingly desired. Among the approaches to categorizing articles, labeling is one of the most common methods; it provides a relatively intuitive and effective way to separate articles into different categories. However, manual labeling is limited by its efficiency, even though manually selected labels have relatively high quality. This report explores the topic modeling approach of Online Latent Dirichlet Allocation (Online-LDA). Additionally, it implements a method to automatically label articles with their latent topics by combining the Online-LDA posterior with a probabilistic automatic labeling algorithm. The goal of this report is to examine the accuracy of the labels generated automatically by a topic model and a probabilistic relevance algorithm for a set of real-world, dynamically updated articles from an online Rich Site Summary (RSS) service.

Relevance:

40.00%

Publisher:

Abstract:

This research aims to enhance the quality of tag-based item recommendation systems by employing a multi-dimensional user profile approach and by analyzing the semantic aspects of tags. Tag-based recommender systems have two characteristics that need to be carefully studied in order to build a reliable system. Firstly, the multi-dimensional correlation, called tag assignment, should be appropriately modelled in order to create the user profiles [1]. Secondly, the semantics behind the tags should be considered properly, as the flexibility in their design can cause semantic problems such as synonymy and polysemy [2]. This research proposes to address these two challenges for building a tag-based item recommendation system by employing tensor modeling as the multi-dimensional user profile approach and the topic model as the semantic analysis approach. The first objective is to optimize the tensor model reconstruction and to improve the model performance in generating quality recommendations. A novel Tensor-based Recommendation using Probabilistic Ranking (TRPR) method [3] has been developed. Results show that this method is scalable to large datasets and outperforms the benchmark methods in terms of accuracy. The memory-efficient loop implements the n-mode block-striped (matrix) product for tensor reconstruction as an approximation of the initial tensor. The probabilistic ranking calculates the probability that users select candidate items, using their tag preference list based on the entries generated from the reconstructed tensor. The second objective is to analyse the tag semantics and utilize the outcome in building the tensor model. This research proposes to investigate the problem using a topic model approach in order to preserve the nature of tags as the “social vocabulary” [4]. For the tag assignment data, topics can be generated from the occurrences of tags given for an item.
However, only a limited number of tags is available to represent items as collections of topics, since an item might have been tagged with only a few tags. Consequently, the generated topics might not be able to represent the items appropriately. Furthermore, given that each tag can belong to any topic with various probability scores, the occurrence of tags cannot simply be mapped by the topics to build the tensor model. A standard weighting technique will not appropriately calculate the value of tagging activity, since it will define the context of an item using a tag instead of a topic.
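The abstract above mentions an n-mode (matrix) product used for tensor reconstruction but gives no formula. As a rough illustration only, the sketch below computes a plain n-mode product of a 3-way tensor with a matrix in pure Python. It is not the block-striped, memory-efficient variant the abstract describes, and the function name `mode_n_product` and the nested-list representation are assumptions made for this example.

```python
def mode_n_product(tensor, matrix, mode):
    """Plain n-mode product of a 3-way tensor (nested lists, shape I x J x K)
    with a matrix of shape P x dims[mode]. Hypothetical helper, not TRPR itself."""
    dims = [len(tensor), len(tensor[0]), len(tensor[0][0])]
    if mode not in (0, 1, 2):
        raise ValueError("mode must be 0, 1, or 2")
    if any(len(row) != dims[mode] for row in matrix):
        raise ValueError("matrix columns must match the tensor size along `mode`")

    # The output keeps every dimension except `mode`, which becomes P.
    out_dims = list(dims)
    out_dims[mode] = len(matrix)
    out = [[[0.0] * out_dims[2] for _ in range(out_dims[1])]
           for _ in range(out_dims[0])]

    for i in range(dims[0]):
        for j in range(dims[1]):
            for k in range(dims[2]):
                value = tensor[i][j][k]
                source = (i, j, k)[mode]  # index being contracted
                for p in range(len(matrix)):
                    target = [i, j, k]
                    target[mode] = p
                    out[target[0]][target[1]][target[2]] += matrix[p][source] * value
    return out


# Toy 2 x 2 x 2 tensor; think of the axes as user x tag x item (illustrative only).
X = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
summed_over_users = mode_n_product(X, [[1, 1]], mode=0)
```

In a Tucker-style reconstruction, products like this one would be applied along each mode in turn with the corresponding factor matrix; how the thesis stripes the computation into blocks is not stated in the abstract.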

Relevance:

30.00%

Publisher:

Relevance:

30.00%

Publisher:

Abstract:

Quantum theory has recently been employed to further advance the theory of information retrieval (IR). A challenging research topic is to investigate the so-called quantum-like interference in users’ relevance judgement process, in which users judge the degree of relevance of each document with respect to a given query. In this process, a user’s relevance judgement for the current document is often interfered with by the judgements for previous documents, due to the interference with the user’s cognitive status. Research in cognitive science has demonstrated some initial evidence of quantum-like cognitive interference in human decision making, which underpins the user’s relevance judgement process. This motivates us to model such cognitive interference in the relevance judgement process, which we believe will lead to better modeling and explanation of user behavior in the relevance judgement process for IR, and eventually to more user-centric IR models. In this paper, we propose to use a probabilistic automaton (PA) and a quantum finite automaton (QFA), which are suitable for representing the transition of user judgement states, to dynamically model the cognitive interference when the user is judging a list of documents.
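To make the probabilistic automaton mentioned above concrete, the following minimal sketch tracks a probability distribution over judgement states and updates it with one row-stochastic matrix per input symbol. The two states, the symbols "R" and "N" (the previously judged document was relevant or not), and all transition probabilities are invented for illustration; the abstract does not specify them, and the quantum finite automaton is not covered here.

```python
def step(dist, matrix):
    """One PA transition: new_dist[j] = sum_i dist[i] * matrix[i][j]."""
    return [sum(dist[i] * matrix[i][j] for i in range(len(dist)))
            for j in range(len(matrix[0]))]


def run_pa(initial, transitions, symbols):
    """Feed a sequence of symbols to the automaton and return the final
    distribution over states."""
    dist = list(initial)
    for symbol in symbols:
        dist = step(dist, transitions[symbol])
    return dist


# Hypothetical model: state 0 = "judges not relevant", state 1 = "judges relevant".
# Each row sums to 1; the numbers are assumptions, not values from the paper.
TRANSITIONS = {
    "R": [[0.5, 0.5], [0.25, 0.75]],   # previous document was relevant
    "N": [[0.75, 0.25], [0.5, 0.5]],   # previous document was not relevant
}

final = run_pa([1.0, 0.0], TRANSITIONS, "RN")
```

Because every matrix is row-stochastic, the distribution stays normalized after each step, which is the property that distinguishes a PA from the unitary evolution used in a QFA.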

Relevance:

30.00%

Publisher:

Abstract:

Software development and Web site development techniques have evolved significantly over the past 20 years. The relatively young Web application development area has borrowed heavily from traditional software development methodologies, primarily due to the similarities in the areas of data persistence and User Interface (UI) design. Recent developments in this area propose a new Web Modeling Language (WebML) to accommodate the nuances specific to Web development. WebML is one of a number of implementations designed to enable modeling of Web site interaction flows while remaining extensible to accommodate new features in Web site development in the future. Our research aims to extend WebML with a focus on stigmergy, a biological term originally used to describe coordination between insects. We see design features in existing Web sites that mimic stigmergic mechanisms as part of the UI. We believe that we can synthesize and embed stigmergy in Web 2.0 sites. This paper focuses on the sub-topic of site UI design and the stigmergic mechanism designs required to achieve this.

Relevance:

30.00%

Publisher:

Abstract:

Modeling as a Means for Making Powerful Ideas Accessible to Children at an Early Age. Richard Lesh, Lyn English, Serife Sevis, Chanda Riggs. The SimCalc Vision and Contributions, Advances in Mathematics Education, 2013, pp 419-436.
In modern societies in the 21st century, significant changes have been occurring in the kinds of “mathematical thinking” that are needed outside of school. Even in the case of primary school children (grades K-2), children not only encounter situations where numbers refer to sets of discrete objects that can be counted. Numbers also are used to describe situations that involve continuous quantities (inches, feet, pounds, etc.), signed quantities, quantities that have both magnitude and direction, locations (coordinates, or ordinal quantities), transformations (actions), accumulating quantities, continually changing quantities, and other kinds of mathematical objects. Furthermore, if we ask, what kind of situations can children use numbers to describe? rather than restricting attention to situations where children should be able to calculate correctly, then this study shows that average ability children in grades K-2 are (and need to be) able to productively mathematize situations that involve far more than simple counts. Similarly, whereas nearly the entire K-16 mathematics curriculum is restricted to situations that can be mathematized using a single input-output rule going in one direction, even the lives of primary school children are filled with situations that involve several interacting actions, and which involve feedback loops, second-order effects, and issues such as maximization, minimization, or stabilizations (which, many years ago, needed to be postponed until students had been introduced to calculus).
…This brief paper demonstrates that, if children’s stories are used to introduce simulations of “real life” problem solving situations, then average ability primary school children are quite capable of dealing productively with 60-minute problems that involve (a) many kinds of quantities in addition to “counts,” (b) integrated collections of concepts associated with a variety of textbook topic areas, (c) interactions among several different actors, and (d) issues such as maximization, minimization, and stabilization.

Relevance:

30.00%

Publisher:

Abstract:

One of the most fundamental and widely accepted ideas in finance is that investors are compensated through higher returns for taking on non-diversifiable risk. Hence the quantification, modeling and prediction of risk have been, and still are, among the most prolific research areas in financial economics. It was recognized early on that there are predictable patterns in the variance of speculative prices. Later research has shown that there may also be systematic variation in the skewness and kurtosis of financial returns. Lacking in the literature so far is an out-of-sample forecast evaluation of the potential benefits of these newer, more complicated models with time-varying higher moments. Such an evaluation is the topic of this dissertation. Essay 1 investigates the forecast performance of the GARCH(1,1) model when estimated with 9 different error distributions on Standard and Poor’s 500 Index Future returns. By utilizing the theory of realized variance to construct an appropriate ex post measure of variance from intra-day data, it is shown that allowing for a leptokurtic error distribution leads to significant improvements in variance forecasts compared to using the normal distribution. This result holds for daily, weekly, and monthly forecast horizons. It is also found that allowing for skewness and time variation in the higher moments of the distribution does not further improve forecasts. In Essay 2, using 20 years of daily Standard and Poor’s 500 index returns, it is found that density forecasts are much improved by allowing for constant excess kurtosis but not improved by allowing for skewness. Allowing the kurtosis and skewness to be time varying does not further improve the density forecasts; on the contrary, it makes them slightly worse. In Essay 3, a new model incorporating conditional variance, skewness and kurtosis based on the Normal Inverse Gaussian (NIG) distribution is proposed.
The new model and two previously used NIG models are evaluated by their Value at Risk (VaR) forecasts on a long series of daily Standard and Poor’s 500 returns. The results show that only the new model produces satisfactory VaR forecasts for both 1% and 5% VaR. Taken together, the results of the thesis show that kurtosis appears not to exhibit predictable time variation, whereas some predictability is found in the skewness. However, the dynamic properties of the skewness are not completely captured by any of the models.
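For readers unfamiliar with the GARCH(1,1) model named in Essay 1, the sketch below shows the standard conditional variance recursion, sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The parameter values and return series are made up for illustration and are not estimates from the dissertation. The choice of error distribution affects how omega, alpha, and beta would be estimated, not the recursion itself, so estimation is left out.

```python
def garch11_variances(returns, omega, alpha, beta):
    """Return conditional variances for a GARCH(1,1) process.

    The list has len(returns) + 1 entries: the first is the starting value
    (the unconditional variance), the last is the one-step-ahead forecast.
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    # Start the recursion at the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


# Illustrative inputs only (assumed, not taken from the thesis).
toy_returns = [0.0, 0.0]
variances = garch11_variances(toy_returns, omega=0.1, alpha=0.1, beta=0.8)
one_step_forecast = variances[-1]
```

With zero returns the variance decays geometrically toward omega / (1 - beta); a large squared return pushes the next variance up by alpha times that squared return, which is the volatility clustering the model is meant to capture.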

Relevance:

30.00%

Publisher:

Abstract:

This thesis comprises three different projects within the topic of tropical atmospheric dynamics. First, I analyze observations of thermal radiation from Saturn’s atmosphere and, from them, determine the latitudinal distribution of ammonia vapor near the 1.5-bar pressure level. The most prominent feature of the observations is the high brightness temperature of Saturn’s subtropical latitudes on either side of the equator. After comparing the observations to a microwave radiative transfer model, I find that these subtropical bands require very low ammonia relative humidity below the ammonia cloud layer in order to achieve the high brightness temperatures observed. I suggest that these bright subtropical bands represent dry zones created by a meridionally overturning circulation.

Second, I use a dry atmospheric general circulation model to study equatorial superrotation in terrestrial atmospheres. A wide range of atmospheres are simulated by varying three parameters: the pole-equator radiative equilibrium temperature contrast, the convective lapse rate, and the planetary rotation rate. A scaling theory is developed that establishes conditions under which superrotation occurs in terrestrial atmospheres. The scaling arguments show that superrotation is favored when the off-equatorial baroclinicity and planetary rotation rates are low. Similarly, superrotation is favored when the convective heating strengthens, which may account for the superrotation seen in extreme global-warming simulations.

Third, I use a moist slab-ocean general circulation model to study the impact of a zonally-symmetric continent on the distribution of monsoonal precipitation. I show that adding a hemispheric asymmetry in surface heat capacity is sufficient to cause symmetry breaking in both the spatial and temporal distribution of precipitation. This spatial symmetry breaking can be understood from a large-scale energetic perspective, while the temporal symmetry breaking requires consideration of the dynamical response to the heat capacity asymmetry and the seasonal cycle of insolation. Interestingly, the idealized monsoonal precipitation bears resemblance to precipitation in the Indian monsoon sector, suggesting that this work may provide insight into the causes of the temporally asymmetric distribution of precipitation over southeast Asia.

Relevance:

30.00%

Publisher:

Abstract:

Moving ecosystem modeling from research to applications and operations has direct management relevance and will be integral to achieving the water quality and living resource goals of the 2010 Chesapeake Bay Executive Order. Yet despite decades of ecosystem modeling efforts linking climate to water quality, plankton and fish, ecological models are rarely taken to the operational phase. In an effort to promote operational ecosystem modeling and ecological forecasting in Chesapeake Bay, a meeting was convened on this topic at the 2010 Chesapeake Modeling Symposium (May 10-11). These presentations show that tremendous progress has been made over the last five years toward the development of operational ecological forecasting models, and that efforts in Chesapeake Bay are leading the way nationally. Ecological forecasts predict the impacts of chemical, biological, and physical changes on ecosystems, ecosystem components, and people. They have great potential to educate and inform not only ecosystem management, but also the outlook and opinion of the general public, for whom we manage coastal ecosystems. In the context of the Chesapeake Bay Executive Order, ecological forecasting can be used to identify favorable restoration sites, predict which sites and species will be viable under various climate scenarios, and predict the impact of a restoration project on water quality.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: With the global expansion of clinical trials and the expected rise of the emerging economies known as BRICs (Brazil, Russia, India and China), understanding the factors that affect the willingness of patients from those countries to participate in clinical trials assumes a central role in the future of health research. METHODS: We conducted a systematic review and meta-analysis (SRMA) of willingness to participate in clinical trials among Brazilian patients and then compared it with that of Indian patients (using the results of another SRMA previously conducted by our group) through a system dynamics model. RESULTS: Five studies were included in the SRMA of Brazilian patients. Our main findings are 1) the major motivation for Brazilian patients to participate in clinical trials is altruism, 2) monetary reimbursement is the least important factor motivating Brazilian patients, 3) the major barrier to participation in clinical trials for Brazilian patients is the fear of side effects, and 4) Brazilian patients are more willing to participate in clinical trials than Indian patients. CONCLUSION: Our study provides important insights for investigators and sponsors planning trials in Brazil (and India) in the future. Ignoring these results may lead to unnecessary spending of funds and time. More studies are needed to validate our results and to better understand this poorly studied theme.

Relevance:

30.00%

Publisher:

Abstract:

A core activity in information systems development involves understanding the conceptual model of the domain that the information system supports. Any conceptual model is ultimately created using a conceptual-modeling (CM) grammar. Accordingly, just as high quality conceptual models facilitate high quality systems development, high quality CM grammars facilitate high quality conceptual modeling. This paper seeks to provide a new perspective on improving the quality of CM grammar semantics. For the past twenty years, the leading approach to this topic has drawn on ontological theory. However, the ontological approach captures just half of the story. It needs to be coupled with a logical approach. We show how ontological quality and logical quality interrelate and we outline three contributions of a logical approach: the ability to see familiar conceptual-modeling problems in simpler ways, the illumination of new problems, and the ability to prove the benefit of modifying CM grammars.

Relevance:

30.00%

Publisher:

Abstract:

Lignocellulosic biomass pretreatment and the subsequent thermal conversion processes to produce solid, liquid, and gas biofuels are attractive solutions for today's energy challenges. The structural study of the main components in biomass and their macromolecular complexes is an active and ongoing research topic worldwide. The interactions among the three main components, cellulose, hemicellulose, and lignin, are studied in this paper using electronic structure methods, and the study includes examining the hydrogen bond network of cellulose-hemicellulose systems and the covalent bond linkages of hemicellulose-lignin systems. Several methods (semiempirical, Hartree-Fock, and density functional theory) using different basis sets were evaluated. It was shown that theoretical calculations can be used to simulate small model structures representing wood components. By comparing calculation results with experimental data, it was concluded that B3LYP/6-31G is the most suitable basis set to describe the hydrogen bond system and B3LYP/6-31G(d,p) is the most suitable basis set to describe the covalent system of woody biomass. The choice of unit model has a much larger effect on hydrogen bonding within the cellulose-hemicellulose system, whereas the model choice has a minimal effect on the covalent linkage in the hemicellulose-lignin system. © 2011 American Chemical Society.

Relevance:

30.00%

Publisher:

Abstract:

A core activity in information systems development involves building a conceptual model of the domain that an information system is intended to support. Such models are created using a conceptual-modeling (CM) grammar. Just as high-quality conceptual models facilitate high-quality systems development, high-quality CM grammars facilitate high-quality conceptual modeling. This paper provides a new perspective on ways to improve the quality of the semantics of CM grammars. For many years, the leading approach to this topic has relied on ontological theory. We show, however, that the ontological approach captures only half the story. It needs to be coupled with a logical approach. We explain how the ontological quality and logical quality of CM grammars interrelate. Furthermore, we outline three contributions that a logical approach can make to evaluating the quality of CM grammars: a means of seeing some familiar conceptual-modeling problems in simpler ways; the illumination of new problems; and the ability to prove the benefit of modifying existing CM grammars in particular ways. We demonstrate these benefits in the context of the Entity-Relationship grammar. More generally, our paper opens up a new area of research with many opportunities for future research and practice.

Relevance:

30.00%

Publisher:

Abstract:

Thesis (Master's)--University of Washington, 2012