10 results for Multiple-trait model
in Aston University Research Archive
Abstract:
An interactive hierarchical Generative Topographic Mapping (HGTM) \cite{HGTM} has been developed to visualise complex data sets. In this paper, we build a more general visualisation system by extending the HGTM visualisation system in three directions: (1) We generalize HGTM to noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM) developed in \cite{Kabanpami}. (2) We give the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode the user interactively selects "regions of interest" as in \cite{HGTM}, whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of LTMs is employed. (3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualisation plots, since they can highlight the boundaries between data clusters. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a toy example and apply our system to three more complex real data sets.
Abstract:
The goal of this project was to investigate the neural correlates of reading impairment in dyslexia as hypothesised by the main theories: the phonological deficit, visual magnocellular deficit and cerebellar deficit theories, with emphasis on individual differences. This research took a novel approach by: 1) contrasting the predictions in one sample of participants with dyslexia (DPs); 2) using a multiple-case study (and between-group comparisons) to investigate differences in BOLD between each DP and the controls (CPs); 3) demonstrating a possible relationship between reading impairment and its hypothesised neural correlates by using fMRI and a reading task. The multiple-case study revealed that the neural correlates of reading in dyslexia are not, in all cases, in agreement with the predictions of a single theory. The results show striking individual differences: even where the neural correlates of reading in two DPs are consistent with the same theory, the areas can differ. A DP can exhibit under-engagement in an area in word reading but not in pseudoword reading, and vice versa, demonstrating that underactivation in that area cannot be interpreted as a 'developmental lesion'. Additional analyses revealed complex results. Within-group analyses between behavioural measures and BOLD showed correlations in the predicted regions, correlations in areas outside the ROI, and a lack of correlations in some predicted areas. Comparisons of subgroups which differed on Orthography Composite supported the MDT, but only for Words. The results suggest that phonological scores are not a sufficient predictor of the under-engagement of phonological areas during reading. DPs and CPs exhibited correlations between Purdue Pegboard Composite and BOLD in cerebellar areas only for Pseudowords. Future research into reading in dyslexia should use a more holistic approach, involving genetic and environmental factors, gene by environment interaction, and comorbidity with other disorders.
It is argued that multidisciplinary research within the multiple-deficit model holds significant promise here.
Abstract:
Recently, we have developed the hierarchical Generative Topographic Mapping (HGTM), an interactive method for visualization of large high-dimensional real-valued data sets. In this paper, we propose a more general visualization system by extending HGTM in three ways, which allows the user to visualize a wider range of data sets and better support the model development process. 1) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM). This enables us to visualize data of inherently discrete nature, e.g., collections of documents, in a hierarchical manner. 2) We give the user a choice of initializing the child plots of the current plot in either interactive, or automatic mode. In the interactive mode, the user selects "regions of interest," whereas in the automatic mode, an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. 3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualization plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets. © 2005 IEEE.
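As a rough illustration of what a magnification factor measures, the sketch below uses the local area ratio sqrt(det(J^T J)) commonly quoted for the Gaussian-noise GTM, where J is the Jacobian of the mapping from a 2-D latent space to data space. This is an assumption for illustration only: the general latent trait formulas derived in the paper are not reproduced in the abstract, and the function names and toy mappings here are hypothetical.

```python
import math

def jacobian_columns(f, x, h=1e-6):
    """Central-difference Jacobian of f: R^2 -> R^D at latent point x.

    Returns two lists; the i-th list holds the partial derivatives of
    every output coordinate with respect to latent coordinate i.
    """
    columns = []
    for i in range(2):
        hi, lo = list(x), list(x)
        hi[i] += h
        lo[i] -= h
        columns.append([(a - b) / (2 * h) for a, b in zip(f(hi), f(lo))])
    return columns

def magnification_factor(f, x):
    """Local area ratio sqrt(det(J^T J)) of the latent-to-data mapping f at x."""
    c0, c1 = jacobian_columns(f, x)
    g00 = sum(a * a for a in c0)
    g01 = sum(a * b for a, b in zip(c0, c1))
    g11 = sum(b * b for b in c1)
    # J^T J is a 2x2 matrix, so its determinant is written out directly;
    # max(..., 0.0) guards against tiny negative values from rounding.
    return math.sqrt(max(g00 * g11 - g01 * g01, 0.0))

# Hypothetical toy mappings from a 2-D latent space into 3-D data space.
def stretched_plane(x):
    return [2.0 * x[0], 3.0 * x[1], 0.0]

def paraboloid(x):
    return [x[0], x[1], x[0] ** 2 + x[1] ** 2]
```

Evaluating `magnification_factor` over a grid of latent points and plotting the values gives a map in which regions of strong stretching stand out; in a trained model such regions tend to lie between clusters.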
Abstract:
The use of the multiple indicators, multiple causes model to operationalize formative variables (the formative MIMIC model) is advocated in the methodological literature. Yet, contrary to popular belief, the formative MIMIC model does not provide a valid method of integrating formative variables into empirical studies and we recommend discarding it from formative models. Our arguments rest on the following observations. First, much formative variable literature appears to conceptualize a causal structure between the formative variable and its indicators which can be tested or estimated. We demonstrate that this assumption is illogical, that a formative variable is simply a researcher-defined composite of sub-dimensions, and that such tests and estimates are unnecessary. Second, despite this, researchers often use the formative MIMIC model as a means to include formative variables in their models and to estimate the magnitude of linkages between formative variables and their indicators. However, the formative MIMIC model cannot provide this information since it is simply a model in which a common factor is predicted by some exogenous variables—the model does not integrate within it a formative variable. Empirical results from such studies need reassessing, since their interpretation may lead to inaccurate theoretical insights and the development of untested recommendations to managers. Finally, the use of the formative MIMIC model can foster fuzzy conceptualizations of variables, particularly since it can erroneously encourage the view that a single focal variable is measured with formative and reflective indicators. We explain these interlinked arguments in more detail and provide a set of recommendations for researchers to consider when dealing with formative variables.
Abstract:
This thesis applies a hierarchical latent trait model system to a large quantity of data. The motivation was the lack of viable approaches for analysing High Throughput Screening datasets, which may include thousands of high-dimensional data points. High Throughput Screening (HTS) is an important tool in the pharmaceutical industry for discovering leads which can be optimised and further developed into candidate drugs. Since the development of new robotic technologies, the ability to test the activities of compounds has considerably increased in recent years. Traditional methods, which rely on tables and graphical plots to analyse relationships between measured activities and the structure of compounds, are not feasible for a large HTS dataset. Instead, data visualisation provides a method for analysing such large datasets, especially those with high dimensions. So far, a few visualisation techniques for drug design have been developed, but most of them cope with only a few properties of compounds at a time. We believe that a latent variable model with a non-linear mapping from the latent space to the data space is a preferred choice for visualising a complex high-dimensional data set. As a type of latent variable model, the latent trait model (LTM) can deal with either continuous data or discrete data, which makes it particularly useful in this domain. In addition, with the aid of differential geometry, we can picture the distribution of the data from magnification factor and curvature plots. Rather than obtaining the useful information from a single plot, a hierarchical LTM arranges a set of LTMs and their corresponding plots in a tree structure. We model the whole data set with an LTM at the top level, which is broken down into clusters at deeper levels of the hierarchy. In this manner, refined visualisation plots can be displayed at deeper levels and sub-clusters may be found.
The hierarchy of LTMs is trained using the expectation-maximisation (EM) algorithm to maximise its likelihood with respect to the data sample. Training proceeds interactively in a recursive, top-down fashion. The user subjectively identifies interesting regions on the visualisation plot that they would like to model in greater detail. At each stage of hierarchical LTM construction, the EM algorithm alternates between the E- and M-steps. Another problem that can occur when visualising a large data set is that there may be significant overlaps of data clusters, which makes it very difficult for the user to judge where the centres of regions of interest should be placed. We address this problem by employing the minimum message length technique, which can help the user to decide the optimal structure of the model. In this thesis we also demonstrate the applicability of the hierarchy of latent trait models in the field of document data mining.
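The alternation between E- and M-steps can be illustrated on a far simpler model than a hierarchy of LTMs. The sketch below is a toy under stated assumptions, not the thesis's training procedure: it fits only the two means of a 1-D Gaussian mixture with unit variances and equal mixing weights, and the function name `em_two_gaussians` and the sample data are hypothetical.

```python
import math

def em_two_gaussians(xs, mu=(-1.0, 1.0), n_iter=20):
    """EM for the means of a two-component 1-D Gaussian mixture.

    Assumes unit variances and equal mixing weights, so only the means
    are learned. Returns the fitted means and the log-likelihood (up to
    an additive constant) recorded at the start of each iteration.
    """
    mu = list(mu)
    history = []
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        loglik = 0.0
        for x in xs:
            dens = [math.exp(-0.5 * (x - m) ** 2) for m in mu]
            total = sum(dens)
            resp.append([d / total for d in dens])
            loglik += math.log(total)
        history.append(loglik)
        # M-step: each mean becomes the responsibility-weighted average.
        for k in range(2):
            weight = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / weight
    return mu, history

# Hypothetical sample with two well-separated groups.
sample = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
means, loglik_history = em_two_gaussians(sample)
```

The recorded log-likelihood never decreases from one iteration to the next, which is the property that makes EM a convenient training loop at each node of a hierarchy.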
Abstract:
Early detection of glaucoma relies on a detailed knowledge of how the normal optic nerve head (ONH) varies within the population. This study focused on two main areas: 1. To explore the optic nerve head appearance in the normal optometric population and compare the South Asian (principally Pakistani) with the European white population, correcting for possible ocular and non-ocular influences in a multiple regression model. The main findings were: • The optic discs of the South Asian (SA) and White European (WE) populations were not statistically different in size. The SA group possessed discs with increased cupping and thinner neuro-retinal rims (NRR) compared with the WE group. The SA group also demonstrated a more vertically oval shape than the WE population. These differences were significant at the p<0.01 level. • The upper limits of inter-eye asymmetry were ≤0.2 for cup to disc area ratio and 3 mmHg for intra-ocular pressure (IOP) for both ethnic groups, and this did not increase with age. IOP asymmetry did not vary with gender, ethnicity or a family history of glaucoma and was independent of ONH asymmetry. ONH and IOP asymmetry are therefore independent risk factors when screening for glaucoma for both ethnic groups. 2. To investigate the validity of the ISNT rule (inferior > superior > nasal > temporal NRR thickness) in the optometric population. The main findings were: • As disc size increased, the disc became rounder and less vertically oval in shape. Vertically oval discs had thicker superior and inferior NRRs and thinner nasal and temporal NRRs compared with rounder disc shapes, due to cup shape being independent of disc shape. Vertically oval discs were therefore more likely to obey the ISNT rule than larger rounder discs. • The ISNT rule has a low adherence in our sample of normal eyes (5.7%). However, by removing the nasal sector to become the IST rule, 74.5% of normal eyes obeyed.
SA eyes and female gender were more likely to obey the ISNT rule due to increased disc ovality. The IST rule is independent of disc shape and therefore more suitable for assessing discs from both ethnic backgrounds. Obeying the ISNT rule or IST rule was not related to disc or cup size.
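The two rules compared above reduce to chained inequalities on rim thickness. A minimal sketch follows, assuming the IST rule is the ISNT ordering with the nasal sector dropped, as the abstract describes; the function names and the example measurements are hypothetical and are not data from the study.

```python
def obeys_isnt(inferior, superior, nasal, temporal):
    """ISNT rule: inferior > superior > nasal > temporal rim thickness."""
    return inferior > superior > nasal > temporal

def obeys_ist(inferior, superior, temporal):
    """IST rule: the ISNT ordering with the nasal sector removed."""
    return inferior > superior > temporal

# Hypothetical rim thicknesses (arbitrary units) for one eye whose
# nasal rim is thinner than its temporal rim.
eye = {"inferior": 0.40, "superior": 0.35, "nasal": 0.20, "temporal": 0.25}

isnt_ok = obeys_isnt(eye["inferior"], eye["superior"], eye["nasal"], eye["temporal"])
ist_ok = obeys_ist(eye["inferior"], eye["superior"], eye["temporal"])
```

An eye like this one fails ISNT only because of the nasal sector, yet passes IST, which is the kind of case that separates the two adherence rates reported above.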
Abstract:
Purpose: The aims of this study were to develop an algorithm to accurately quantify Vigabatrin (VGB)-induced central visual field loss and to investigate the relationship between visual field loss and maximum daily dose, cumulative dose and duration of dose. Methods: The sample comprised 31 patients (mean age 37.9 years; SD 14.4 years) diagnosed with epilepsy and exposed to VGB. Each participant underwent standard automated static visual field examination of the central visual field. Central visual field loss was determined using continuous scales quantifying severity in terms of area and depth of defect and additionally by symmetry of defect between the two eyes. A simultaneous multiple regression model was used to explore the relationship between these visual field parameters and the drug predictor variables. Results: The regression model indicated that maximum VGB dose was the only factor to be significantly correlated with individual eye severity (right eye: p = 0.020; left eye: p = 0.012) and symmetry of visual field defect (p = 0.024). Conclusions: Maximum daily dose was the single most reliable indicator of those patients likely to exhibit visual field defects due to VGB. These findings suggest that high maximum dose is more likely to result in visual field defects than high cumulative doses or those of long duration.
Abstract:
Analysing the molecular polymorphism and interactions of DNA, RNA and proteins is of fundamental importance in biology. Predicting functions of polymorphic molecules is important in order to design more effective medicines. Analysing major histocompatibility complex (MHC) polymorphism is important for mate choice, epitope-based vaccine design, transplantation rejection, etc. Most of the existing exploratory approaches cannot analyse these datasets because of the large number of molecules with a high number of descriptors per molecule. This thesis develops novel methods for data projection in order to explore high-dimensional biological datasets by visualising them in a low-dimensional space. With increasing dimensionality, some existing data visualisation methods such as generative topographic mapping (GTM) become computationally intractable. We propose variants of these methods, in which we use log-transformations at certain steps of the expectation maximisation (EM) based parameter learning process, to make them tractable for high-dimensional datasets. We demonstrate these proposed variants on both synthetic data and an electrostatic potential dataset of MHC class-I. We also propose to extend a latent trait model (LTM), suitable for visualising high-dimensional discrete data, to simultaneously estimate feature saliency as an integrated part of the parameter learning process of a visualisation model. This LTM variant not only gives better visualisation by modifying the projection map based on feature relevance, but also helps users to assess the significance of each feature. Another problem which is not addressed much in the literature is the visualisation of mixed-type data. We propose to combine GTM and LTM in a principled way, where appropriate noise models are used for each type of data, in order to visualise mixed-type data in a single plot. We call this model a generalised GTM (GGTM).
We also propose to extend the GGTM model to estimate feature saliencies while training a visualisation model; this is called GGTM with feature saliency (GGTM-FS). We demonstrate the effectiveness of these proposed models on both synthetic and real datasets. We evaluate visualisation quality using quality metrics such as the distance distortion measure and rank-based measures: trustworthiness, continuity, and mean relative rank errors with respect to data space and latent space. In cases where the labels are known, we also use KL divergence and nearest neighbour classification error as quality metrics in order to determine the separation between classes. We demonstrate the efficacy of these proposed models on both synthetic and real biological datasets, with a main focus on the MHC class-I dataset.
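The abstract does not say which log-transformations are used, so the sketch below shows one common log-domain device, the log-sum-exp trick, as an assumed example of why such a step helps in high dimensions. Per-component log-likelihoods of a high-dimensional point are often so negative that exponentiating them underflows to zero; normalising in the log domain avoids this. The function names and the input values are hypothetical.

```python
import math

def responsibilities_naive(log_likes):
    """Normalise component likelihoods directly; fails when all underflow."""
    likes = [math.exp(v) for v in log_likes]
    total = sum(likes)
    return [v / total for v in likes]  # ZeroDivisionError if total == 0.0

def responsibilities_log_domain(log_likes):
    """Normalise in the log domain using the log-sum-exp trick."""
    top = max(log_likes)
    # Subtracting the maximum makes the largest exponent exactly 0,
    # so at least one term in the sum equals 1 and nothing underflows.
    log_norm = top + math.log(sum(math.exp(v - top) for v in log_likes))
    return [math.exp(v - log_norm) for v in log_likes]

# Hypothetical log-likelihoods of one data point under three components.
log_likes = [-1000.0, -1001.0, -1005.0]
resp = responsibilities_log_domain(log_likes)
```

With these inputs the naive version divides by zero, because every `math.exp` call underflows to 0.0, while the log-domain version returns responsibilities that sum to one.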
Abstract:
This paper presents an effective decision making system for leak detection based on multiple generalized linear models and clustering techniques. The training data for the proposed decision system is obtained by setting up a fully operational experimental pipeline distribution system. The system is also equipped with data logging for three variables, namely inlet pressure, outlet pressure, and outlet flow. The experimental setup is designed such that multiple operational conditions of the distribution system, including multiple pressures and multiple flows, can be obtained. We then statistically tested and showed that pressure and flow variables can be used as a signature of a leak under the designed multi-operational conditions. It is then shown that detecting leakages by training and testing the proposed multi-model decision system with prior data clustering, under multi-operational conditions, produces better recognition rates than training based on the single-model approach. This decision system is then equipped with the estimation of confidence limits, and a method is proposed for using these confidence limits to obtain more robust leakage recognition results.
Abstract:
Recent theoretical investigations have demonstrated that the stability of mode-locked solutions of multiple frequency channels depends on the degree of inhomogeneity in gain saturation. In this article, these results are generalized to determine conditions on each of the system parameters necessary for both the stability and the existence of mode-locked pulse solutions for an arbitrary number of frequency channels. In particular, we find that the parameters governing saturable intensity discrimination and gain inhomogeneity in the laser cavity also determine the position of bifurcations of solution types. These bifurcations are completely characterized in terms of these parameters. We also determine a balance between cubic gain and quintic loss which, in addition to influencing the stability of mode-locked solutions, is necessary for the existence of solutions. Furthermore, we determine the critical degree of inhomogeneous gain broadening required to support pulses in multiple-frequency channels. © 2010 The American Physical Society.