932 results for Probabilistic latent semantic model
Abstract:
Recurrent wheezing or asthma is a common problem in children that has increased considerably in prevalence in the past few decades. The causes and underlying mechanisms are poorly understood, and it is thought that a number of distinct diseases causing similar symptoms are involved. Due to the lack of a biologically founded classification system, children are classified into phenotypes according to their observed disease-related features (symptoms, signs, measurements). The objectives of this PhD project were a) to develop tools for analysing phenotypic variation of a disease, and b) to examine phenotypic variability of wheezing among children by applying these tools to existing epidemiological data. A combination of graphical methods (multivariate correspondence analysis) and statistical models (latent variable models) was used. In a first phase, a model for discrete variability (latent class model) was applied to data on symptoms and measurements from an epidemiological study to identify distinct phenotypes of wheezing. In a second phase, the modelling framework was expanded to include continuous variability (e.g. along a severity gradient) and combinations of discrete and continuous variability (factor models and factor mixture models). The third phase focused on validating the methods using simulation studies. The main body of this thesis consists of 5 articles (3 published, 1 submitted and 1 to be submitted) including applications, methodological contributions and a review. The main findings and contributions were: 1) The application of a latent class model to epidemiological data (symptoms and physiological measurements) yielded plausible phenotypes of wheezing with distinguishing characteristics that have previously been used as phenotype-defining characteristics. 2) A method was proposed for including responses to conditional questions (e.g. questions on severity or triggers of wheezing asked only of children with wheeze) in multivariate modelling. 3) A panel of clinicians was set up to agree on a plausible model for wheezing diseases. The model can be used to generate datasets for testing the modelling approach. 4) A critical review of methods for defining and validating phenotypes of wheeze in children was conducted. 5) The simulation studies showed that a parsimonious parameterisation of the models is required to identify the true underlying structure of the data. The developed approach can deal with some challenges of real-life cohort data such as variables of mixed mode (continuous and categorical), missing data and conditional questions. If carefully applied, the approach can be used to identify whether the underlying phenotypic variation is discrete (classes), continuous (factors) or a combination of these. These methods could help improve the precision of research into causes and mechanisms and contribute to the development of a new classification of wheezing disorders in children and of other diseases that are difficult to classify.
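The latent class analysis described in the first phase can be illustrated with a short sketch. Below is a minimal EM fit of a latent class model for binary symptom indicators in Python/NumPy; the two-class setup, the variable names and the toy "wheeze" data are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N children x J binary symptom indicators (illustrative only).
N, J, K = 500, 4, 2  # observations, items, latent classes
true_p = np.array([[0.9, 0.8, 0.7, 0.6],   # hypothetical "persistent wheeze" class
                   [0.2, 0.1, 0.2, 0.1]])  # hypothetical "transient wheeze" class
z = rng.integers(0, K, N)
X = (rng.random((N, J)) < true_p[z]).astype(float)

# EM for a latent class model: pi[k] class weights, p[k, j] item probabilities.
pi = np.full(K, 1.0 / K)
p = rng.uniform(0.3, 0.7, (K, J))
for _ in range(200):
    # E-step: responsibilities r[n, k] = P(class k | x_n)
    log_lik = X @ np.log(p).T + (1 - X) @ np.log(1 - p).T + np.log(pi)
    log_lik -= log_lik.max(axis=1, keepdims=True)
    r = np.exp(log_lik)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: update class weights and conditional item probabilities.
    nk = r.sum(axis=0)
    pi = nk / N
    p = np.clip((r.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)

print("class weights:", pi.round(2))
print("item probabilities:\n", p.round(2))
```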
Abstract:
The purpose of this research was to better understand the impact of the 2001 terrorist attacks on public health, particularly Texas public health. The study employed mixed methods to examine changes to public health culture within Texas local public health agencies, important attitudes of public health workers toward responding to a disaster, and the funding policies that might ensure our investment in public health emergency preparedness is protected. A qualitative analysis of interviews conducted with a large sample of public health officials in Texas found that all the constituent parts of a distinct public health preparedness culture existed across the state's local health departments, regardless of size or funding level. The new preparedness culture in Texas had the hallmarks necessary for a robust public health preparedness and emergency response system. The willingness of public health workers, necessary to make these kinds of changes and mount a disaster response, was examined in one of Texas' most experienced disaster response teams: the public health workers for the City of Houston. A hypothesized latent variable model showed that willingness mediated all other factors in the model (self-efficacy, knowledge, barriers, and risk perception) for self-reported likelihood of reporting to work for a disaster. The RMSEA for the final model was 0.042 with a confidence interval of 0.036 to 0.049, and the chi-squared difference test gave P = 0.08, indicating a well-fitted model and suggesting that willingness is an important factor for consideration by preparedness planners and researchers alike. Finally, with disasters on the rise and federal funding for preparedness dwindling, states' policies for distributing these funds, and their advantages and disadvantages, were examined through a review of current literature and public documents and a survey of state-level public health officials, emergency management professionals and researchers. Although the base-plus-per-capita method is the most common, it is not necessarily perceived to be the most effective. No clear "optimal" method emerged from the study, but a strategic combination of three methods was recommended that has the potential to maximize the benefits of each method while minimizing their weaknesses.
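For readers unfamiliar with the fit statistic cited above, the sketch below computes an RMSEA point estimate and a confidence interval from a model's chi-square, degrees of freedom, and sample size, using the standard noncentral chi-square construction. It is a generic Python/SciPy illustration with hypothetical input values, not the study's own analysis code.

```python
import numpy as np
from scipy.stats import ncx2
from scipy.optimize import brentq

def rmsea(chi2, df, n):
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return np.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def rmsea_ci(chi2, df, n, level=0.90):
    """CI from the noncentral chi-square distribution of the test statistic."""
    alpha = (1.0 - level) / 2.0

    def ncp(target):
        # Noncentrality lambda with P(X <= chi2) = target, X ~ ncx2(df, lambda).
        f = lambda lam: ncx2.cdf(chi2, df, lam) - target
        if f(0.0) < 0:  # even lambda = 0 puts too little mass below chi2
            return 0.0
        return brentq(f, 0.0, 10.0 * chi2 + 100.0)

    lam_lo = ncp(1.0 - alpha)  # high target -> small lambda -> lower bound
    lam_hi = ncp(alpha)
    to_rmsea = lambda lam: np.sqrt(lam / (df * (n - 1)))
    return to_rmsea(lam_lo), to_rmsea(lam_hi)

# Hypothetical inputs: chi2 = 210.3, df = 98, N = 1000.
print(round(rmsea(210.3, 98, 1000), 3))
print([round(x, 3) for x in rmsea_ci(210.3, 98, 1000)])
```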
Abstract:
There are several standardised and widespread formats for representing emotions; however, there is no standard semantic model yet. This paper presents a new ontology, called Onyx, that aims to become such a standard while adding concepts from the latest Semantic Web models. In particular, the ontology focuses on representing Emotion Analysis results, but the model itself is abstract and inherits from previous standards and formats, so it can be used as a reference representation of emotions in any future application or ontology. To prove this, we have translated resources from the EmotionML representation to Onyx. We also present several ways in which developers could benefit from using this ontology instead of an ad-hoc representation. Our ultimate goal is to foster the use of semantic technologies for Emotion Analysis while following the Linked Data ideals.
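As a rough illustration of what an Onyx-style annotation might look like in practice, the sketch below builds a small RDF graph with rdflib. The namespace URI and the property names (hasEmotionSet, hasEmotion, hasEmotionCategory, hasEmotionIntensity) are assumptions based on the paper's description; consult the published ontology for the actual vocabulary.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import XSD

# Assumed namespace and property names, for illustration only.
ONYX = Namespace("http://www.gsi.dit.upm.es/ontologies/onyx/ns#")
WNA = Namespace("http://www.gsi.dit.upm.es/ontologies/wnaffect/ns#")

g = Graph()
g.bind("onyx", ONYX)

# Hypothetical instances: an analysed text, its emotion set, one emotion.
entry = URIRef("http://example.org/text/42")
emotion_set = URIRef("http://example.org/emotionset/1")
emotion = URIRef("http://example.org/emotion/1")

g.add((entry, ONYX.hasEmotionSet, emotion_set))
g.add((emotion_set, ONYX.hasEmotion, emotion))
g.add((emotion, ONYX.hasEmotionCategory, WNA.joy))
g.add((emotion, ONYX.hasEmotionIntensity, Literal(0.8, datatype=XSD.float)))

print(g.serialize(format="turtle"))
```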
USE OF THEORIES IN THE FIELD OF INFORMATION SYSTEMS: A MAPPING USING TEXT MINING TECHNIQUES
Abstract:
This dissertation presents a mapping of the use of information systems theories, applying information retrieval techniques and data and text mining methodologies. The theories covered were Transaction Cost Economics (TCE), the Resource-Based View of the firm (RBV) and Institutional Theory (IT), chosen for their strong relevance to studies of investment allocation and implementation in information systems. The data set comprised the textual content (in English) of the abstract and theoretical review sections of articles published in Information Systems Research (ISR), Management Information Systems Quarterly (MISQ) and the Journal of Management Information Systems (JMIS) between 2000 and 2008. The results of the text mining technique combined with data mining were compared with the EBSCO advanced search tool and showed greater efficiency in identifying content. Articles grounded in the three theories accounted for 10% of all articles in the three journals, and the most prolific publication years were 2001 and 2007.
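A minimal sketch of the kind of TF-IDF screening the dissertation describes might look like the following; the seed terms, the similarity threshold and the use of scikit-learn are invented for illustration and stand in for whatever tooling the author actually used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical seed descriptions for each theory (illustrative only).
seeds = {
    "TCE": "transaction cost economics governance opportunism asset specificity",
    "RBV": "resource based view firm capabilities competitive advantage",
    "IT":  "institutional theory isomorphism legitimacy mimetic coercive norms",
}

abstracts = [
    "We study IT outsourcing governance and transaction costs ...",
    "Dynamic capabilities and sustained competitive advantage ...",
]

vec = TfidfVectorizer(stop_words="english")
matrix = vec.fit_transform(list(seeds.values()) + abstracts)
seed_vecs, doc_vecs = matrix[: len(seeds)], matrix[len(seeds):]

# Assign each abstract to its most similar theory, if any clears a threshold.
sims = cosine_similarity(doc_vecs, seed_vecs)
for text, row in zip(abstracts, sims):
    label, score = max(zip(seeds, row), key=lambda t: t[1])
    tag = label if score > 0.1 else "none"   # arbitrary cut-off
    print(f"{tag:4s} {score:.2f} {text[:50]}")
```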
Abstract:
This dissertation examines the role of topic knowledge (TK) in comprehension among typical readers and those with Specifically Poor Comprehension (SPC), i.e., those who demonstrate deficits in understanding what they read despite adequate decoding. Previous studies of poor comprehension have focused on weaknesses in specific skills, such as word decoding and inferencing ability, but this dissertation examined a different factor: whether deficits in availability and use of TK underlie poor comprehension. It is well known that TK tends to facilitate comprehension among typical readers, but its interaction with working memory and word decoding is unclear, particularly among participants with deficits in these skills. Across several passages, we found that SPCs do in fact have less TK to assist their interpretation of a text. However, we found no evidence that deficits in working memory or word decoding ability make it difficult for children to benefit from their TK when they have it. Instead, children across the skill spectrum are able to draw upon TK to assist their interpretation of a passage. Because TK is difficult to assess and studies vary in methodology, another goal of this dissertation was to compare two methods for measuring it. Both approaches score responses to a concept question to assess TK, but in the first, a human rater assigns a score whereas in the second, a computer algorithm, Latent Semantic Analysis (LSA; Landauer & Dumais, 1997) assigns a score. We found similar results across both methods of assessing TK, suggesting that a continuous measure is not appreciably more sensitive to variations in knowledge than discrete human ratings. This study contributes to our understanding of how best to measure TK, the factors that moderate its relationship with recall, and its role in poor comprehension. The findings suggest that teaching practices that focus on expanding TK are likely to improve comprehension across readers with a variety of abilities.
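The LSA-based scoring the dissertation compares against human ratings can be sketched roughly as follows: learn a truncated-SVD semantic space from a background corpus, then score a child's response by its cosine similarity to a reference answer. The toy corpus, dimensionality and scoring rule are illustrative assumptions, not the dissertation's actual pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Background corpus used to learn the semantic space (toy-sized here;
# real LSA spaces are trained on thousands of documents).
corpus = [
    "volcanoes erupt molten lava and ash from deep underground",
    "earthquakes shake the ground when tectonic plates slip",
    "plants use sunlight to make food through photosynthesis",
    "the water cycle moves water between oceans, air and land",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(corpus)
lsa = TruncatedSVD(n_components=2, random_state=0).fit(X)

def tk_score(response: str, reference: str) -> float:
    """Cosine similarity between response and reference in the LSA space."""
    pair = lsa.transform(vec.transform([response, reference]))
    return float(cosine_similarity(pair[:1], pair[1:])[0, 0])

reference = "volcanoes release lava, ash and gases when they erupt"
print(round(tk_score("lava comes out of a volcano", reference), 2))
print(round(tk_score("plants need sunlight to grow", reference), 2))
```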
Abstract:
The present is marked by the availability of large volumes of heterogeneous data, whose management is extremely complex. While the treatment of factual data has been widely studied, the processing of subjective information still poses important challenges. This is especially true in tasks that combine Opinion Analysis with other challenges, such as those related to Question Answering. In this paper, we describe the different approaches we employed in the NTCIR-8 MOAT monolingual English (opinionatedness, relevance, answerness and polarity) and cross-lingual English-Chinese tasks, implemented in our OpAL system. The results obtained with different settings of the system, together with the error analysis performed after the competition, offered us clear insights into the best combination of techniques, balancing precision and recall. Contrary to our initial intuitions, we also found that including specialized Natural Language Processing tools dealing with Temporality or Anaphora Resolution lowers system performance, while topic detection techniques using faceted search with Wikipedia and Latent Semantic Analysis lead to satisfactory performance, in the monolingual setting as well as in the cross-lingual one.
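To make the polarity subtask concrete, here is a rough lexicon-based polarity scorer of the kind such systems often build on; the mini-lexicon and the one-word negation rule are invented for illustration and are far simpler than the actual OpAL system.

```python
# Tiny hand-made polarity lexicon (illustrative only; real systems use
# resources with thousands of scored entries).
LEXICON = {"good": 1, "great": 2, "excellent": 2,
           "bad": -1, "terrible": -2, "poor": -1}
NEGATORS = {"not", "never", "no"}

def polarity(sentence: str) -> str:
    score, negate = 0, False
    for raw in sentence.lower().split():
        token = raw.strip(".,!?;:")
        if token in NEGATORS:
            negate = True                  # flip the next sentiment word
        elif token in LEXICON:
            score += -LEXICON[token] if negate else LEXICON[token]
            negate = False
    return "POS" if score > 0 else "NEG" if score < 0 else "NEUTRAL"

print(polarity("The proposal is not good."))   # NEG
print(polarity("An excellent, great answer"))  # POS
```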
Abstract:
Based on clues from epidemiology, low prenatal vitamin D has been proposed as a candidate risk factor for schizophrenia. Recent animal experiments have demonstrated that transient prenatal vitamin D deficiency is associated with persistent alterations in brain morphology and neurotrophin expression. In order to explore the utility of the vitamin D animal model of schizophrenia, we examined different types of learning and memory in adult rats exposed to transient prenatal vitamin D deficiency. Compared to control animals, the prenatally depleted animals had a significant impairment of latent inhibition, a feature often associated with schizophrenia. In addition, the depleted group was (a) significantly impaired on hole board habituation and (b) significantly better at maintaining previously learnt rules of brightness discrimination in a Y-chamber. In contrast, the prenatally depleted animals showed no impairment on the spatial learning task in the radial maze, nor on two-way active avoidance learning in the shuttle-box. The results indicate that transient prenatal vitamin D depletion in the rat is associated with subtle and discrete alterations in learning and memory. The behavioural phenotype associated with this animal model may provide insights into the neurobiological correlates of the cognitive impairments of schizophrenia.
Abstract:
Networks exhibiting accelerating growth have total link numbers growing faster than linearly with network size and either reach a limit or exhibit graduated transitions from nonstationary-to-stationary statistics and from random to scale-free to regular statistics as the network size grows. However, if for any reason the network cannot tolerate such gross structural changes then accelerating networks are constrained to have sizes below some critical value. This is of interest as the regulatory gene networks of single-celled prokaryotes are characterized by an accelerating quadratic growth and are size constrained to be less than about 10,000 genes encoded in DNA sequence of less than about 10 megabases. This paper presents a probabilistic accelerating network model for prokaryotic gene regulation which closely matches observed statistics by employing two classes of network nodes (regulatory and non-regulatory) and directed links whose inbound heads are exponentially distributed over all nodes and whose outbound tails are preferentially attached to regulatory nodes and described by a scale-free distribution. This model explains the observed quadratic growth in regulator number with gene number and predicts an upper prokaryote size limit closely approximating the observed value.
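A toy simulation of the growth rule described above (preferentially attached outbound tails on regulatory nodes, roughly exponentially distributed inbound heads, and a per-step link rate that grows with network size) might look like the following; all parameter values are arbitrary, and this is a schematic of the model class rather than the paper's calibrated model.

```python
import numpy as np

rng = np.random.default_rng(1)

RHO = 0.1   # fraction of new nodes that are regulatory (arbitrary)
C = 0.004   # per-step link rate grows as C * n -> quadratic total links

regulators, out_deg = [0], [1.0]   # regulator ids and their out-degrees
edges, checkpoints = [], []

for n in range(2, 3001):           # grow the network one node at a time
    if rng.random() < RHO:
        regulators.append(n - 1)
        out_deg.append(1.0)
    for _ in range(rng.poisson(C * n)):
        # Tail: preferential attachment over regulatory nodes only.
        w = np.array(out_deg)
        i = rng.choice(len(regulators), p=w / w.sum())
        tail = regulators[i]
        out_deg[i] += 1.0
        # Head: exponentially distributed over node age, a crude stand-in
        # for the paper's exponential inbound distribution over all nodes.
        head = max(0, n - 1 - int(rng.exponential(scale=n / 4)))
        edges.append((tail, head))
    if n % 500 == 0:
        checkpoints.append((n, len(edges)))

print(checkpoints)  # total links grow faster than linearly with size n
```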
Abstract:
Latent variable models represent the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. A familiar example is factor analysis, which is based on a linear transformation between the latent space and the data space. In this paper we introduce a form of non-linear latent variable model called the Generative Topographic Mapping, for which the parameters of the model can be determined using the EM algorithm. GTM provides a principled alternative to the widely used Self-Organizing Map (SOM) of Kohonen (1982), and overcomes most of the significant limitations of the SOM. We demonstrate the performance of the GTM algorithm on a toy problem and on simulated data from flow diagnostics for a multi-phase oil pipeline.
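Below is a compact numerical sketch of the GTM EM loop described above, assuming a 1-D latent space, an RBF basis, and toy data standing in for the oil-pipeline diagnostics; the grid sizes and the regularisation constant are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy 1-D curve embedded in 2-D.
N = 400
t = rng.uniform(-1, 1, N)
X = np.column_stack([t, np.sin(3 * t)]) + 0.05 * rng.standard_normal((N, 2))

# Latent grid and RBF basis (1-D latent space for simplicity).
K, M = 40, 10
z = np.linspace(-1, 1, K)[:, None]       # K latent points
mu = np.linspace(-1, 1, M)[:, None]      # M basis centres
sigma = 2.0 / (M - 1)
Phi = np.exp(-((z - mu.T) ** 2) / (2 * sigma**2))
Phi = np.hstack([Phi, np.ones((K, 1))])  # bias column

D = X.shape[1]
W = rng.standard_normal((M + 1, D)) * 0.1
beta = 1.0                               # noise precision

for _ in range(50):                      # EM iterations
    Y = Phi @ W                          # images of latent points in data space
    d2 = ((X[None, :, :] - Y[:, None, :]) ** 2).sum(-1)  # (K, N) sq. distances
    # E-step: responsibility of each latent point for each data point.
    logR = -0.5 * beta * d2
    logR -= logR.max(axis=0, keepdims=True)
    R = np.exp(logR)
    R /= R.sum(axis=0, keepdims=True)
    # M-step: weighted least squares for W, then update the noise precision.
    G = np.diag(R.sum(axis=1))
    W = np.linalg.solve(Phi.T @ G @ Phi + 1e-6 * np.eye(M + 1), Phi.T @ R @ X)
    beta = N * D / (R * d2).sum()

print("final beta (noise precision):", round(beta, 1))
```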
Abstract:
The Self-Organizing Map (SOM) algorithm has been extensively studied and has been applied with considerable success to a wide variety of problems. However, the algorithm is derived from heuristic ideas and this leads to a number of significant limitations. In this paper, we consider the problem of modelling the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. We introduce a novel form of latent variable model, which we call the GTM algorithm (for Generative Topographic Mapping), which allows general non-linear transformations from latent space to data space, and which is trained using the EM (expectation-maximization) algorithm. Our approach overcomes the limitations of the SOM, while introducing no significant disadvantages. We demonstrate the performance of the GTM algorithm on simulated data from flow diagnostics for a multi-phase oil pipeline.
Abstract:
Bove, Pervan, Beatty, and Shiu [Bove, LL, Pervan, SJ, Beatty, SE, Shiu, E. Service worker role in encouraging customer organizational citizenship behaviors. J Bus Res 2009;62(7):698–705.] develop and test a latent variable model of the role of service workers in encouraging customers' organizational citizenship behaviors. However, Bove et al. (2009) claim support for hypothesized relationships between constructs that, owing to insufficient discriminant validity of certain constructs, may be inaccurate. This research comment discusses what discriminant validity represents, outlines procedures for establishing it, and presents an example of inaccurate discriminant validity assessment based on the work of Bove et al. (2009). Solutions to discriminant validity problems and a five-step procedure for assessing discriminant validity conclude the paper. The comment aims to motivate a review of discriminant validity issues and to assist future researchers conducting latent variable analysis.
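One widely used check in this area is the Fornell-Larcker criterion: each construct's average variance extracted (AVE) should exceed its squared correlation with every other construct. The sketch below is a generic illustration of that criterion, not the five-step procedure the comment itself proposes; the loadings and the latent correlation are made up.

```python
import numpy as np

def ave(loadings):
    """Average variance extracted from standardized indicator loadings."""
    lam = np.asarray(loadings)
    return float((lam**2).mean())

# Hypothetical standardized loadings for two constructs and their correlation.
loadings_A = [0.82, 0.78, 0.75]
loadings_B = [0.70, 0.66, 0.72]
phi_AB = 0.85                     # latent correlation between A and B

ave_A, ave_B = ave(loadings_A), ave(loadings_B)
shared = phi_AB**2                # variance the two constructs share

# Fornell-Larcker: the AVE of each construct must exceed the shared variance.
print(f"AVE(A)={ave_A:.2f}, AVE(B)={ave_B:.2f}, r^2={shared:.2f}")
print("discriminant validity:", "OK" if min(ave_A, ave_B) > shared else "FAILS")
```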
Abstract:
There is an increasing emphasis on the use of software to control safety critical plants for a wide area of applications. The importance of ensuring the correct operation of such potentially hazardous systems points to an emphasis on the verification of the system relative to a suitably secure specification. However, the process of verification is often made more complex by the concurrency and real-time considerations which are inherent in many applications. A response to this is the use of formal methods for the specification and verification of safety critical control systems. These provide a mathematical representation of a system which permits reasoning about its properties. This thesis investigates the use of the formal method Communicating Sequential Processes (CSP) for the verification of a safety critical control application. CSP is a discrete event based process algebra which has a compositional axiomatic semantics that supports verification by formal proof. The application is an industrial case study which concerns the concurrent control of a real-time high speed mechanism. It is seen from the case study that the axiomatic verification method employed is complex. It requires the user to have a relatively comprehensive understanding of the nature of the proof system and the application. By making a series of observations the thesis notes that CSP possesses the scope to support a more procedural approach to verification in the form of testing. This thesis investigates the technique of testing and proposes the method of Ideal Test Sets. By exploiting the underlying structure of the CSP semantic model it is shown that for certain processes and specifications the obligation of verification can be reduced to that of testing the specification over a finite subset of the behaviours of the process.
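To give a flavour of reducing verification to testing over a finite set of behaviours, here is a schematic trace checker in Python; the event alphabet, the safety predicate, and the finite "ideal" subset of traces are all invented for illustration and are far simpler than the thesis's CSP formulation, in which such sets are derived from the semantic model.

```python
from itertools import product

# Alphabet of events for a toy mutual-exclusion controller (hypothetical).
EVENTS = ["req1", "enter1", "exit1", "req2", "enter2", "exit2"]

def spec(trace):
    """Safety specification: the two critical sections never overlap."""
    in1 = in2 = False
    for e in trace:
        if e == "enter1":
            if in2:
                return False
            in1 = True
        elif e == "enter2":
            if in1:
                return False
            in2 = True
        elif e == "exit1":
            in1 = False
        elif e == "exit2":
            in2 = False
    return True

# "Ideal test set": a finite subset of behaviours assumed sufficient to
# decide the specification; here we simply enumerate all short traces of
# an unconstrained process that allows any interleaving.
ideal_tests = list(product(EVENTS, repeat=3))

violations = [t for t in ideal_tests if not spec(t)]
print(f"{len(violations)} of {len(ideal_tests)} short traces violate the spec")
print(violations[0])
```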