Digital Library

25 results for Semi-complete Data Synchronization

in Aston University Research Archive

A hierarchical latent variable model for data visualization

Relevance:

100.00%

Abstract:

Visualization has proven to be a powerful and widely-applicable tool for the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines and to data in 36 dimensions derived from satellite images.

Exploratory database visualisation: the application and assessment of data and dimensionality reduction

Relevance:

100.00%

Abstract:

This thesis describes the development of a complete data visualisation system for large tabular databases, such as those commonly found in a business environment. A state-of-the-art 'cyberspace cell' data visualisation technique was investigated and a powerful visualisation system using it was implemented. Although allowing databases to be explored and conclusions drawn, it had several drawbacks, the majority of which were due to the three-dimensional nature of the visualisation. A novel two-dimensional generic visualisation system, known as MADEN, was then developed and implemented, based upon a 2-D matrix of 'density plots'. MADEN allows an entire high-dimensional database to be visualised in one window, while permitting close analysis in 'enlargement' windows. Selections of records can be made and examined, and dependencies between fields can be investigated in detail. MADEN was used as a tool for investigating and assessing many data processing algorithms, firstly data-reducing (clustering) methods, then dimensionality-reducing techniques. These included a new 'directed' form of principal components analysis, several novel applications of artificial neural networks, and discriminant analysis techniques which illustrated how groups within a database can be separated. To illustrate the power of the system, MADEN was used to explore customer databases from two financial institutions, resulting in a number of discoveries which would be of interest to a marketing manager. Finally, the database of results from the 1992 UK Research Assessment Exercise was analysed. Using MADEN allowed both universities and disciplines to be graphically compared, and supplied some startling revelations, including empirical evidence of the 'Oxbridge factor'.

Semi-supervised learning of hierarchical latent trait models for data visualisation

Relevance:

40.00%

Abstract:

An interactive hierarchical Generative Topographic Mapping (HGTM) [H_GTM] has been developed to visualise complex data sets. In this paper, we build a more general visualisation system by extending the HGTM visualisation system in 3 directions: (1) We generalize HGTM to noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM) developed in [Kaban_pami]. (2) We give the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode the user interactively selects "regions of interest" as in [H_GTM], whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of LTMs is employed. (3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualisation plots, since they can highlight the boundaries between data clusters. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a toy example and apply our system to three more complex real data sets.

A semi-oriented radial measure for measuring the efficiency of decision making units with negative data, using DEA

Relevance:

40.00%

Abstract:

Data Envelopment Analysis (DEA) is a nonparametric method for measuring the efficiency of a set of decision making units such as firms or public sector agencies, first introduced into the operational research and management science literature by Charnes, Cooper, and Rhodes (CCR) [Charnes, A., Cooper, W.W., Rhodes, E., 1978. Measuring the efficiency of decision making units. European Journal of Operational Research 2, 429–444]. The original DEA models were applicable only to technologies characterized by positive inputs/outputs. In subsequent literature there have been various approaches to enable DEA to deal with negative data. In this paper, we propose a semi-oriented radial measure, which permits the presence of variables which can take both negative and positive values. The model is applied to data on a notional effluent processing system to compare the results with those yielded by two alternative methods for dealing with negative data in DEA: The modified slacks-based model suggested by Sharp et al. [Sharp, J.A., Liu, W.B., Meng, W., 2006. A modified slacks-based measure model for data envelopment analysis with ‘natural’ negative outputs and inputs. Journal of Operational Research Society 57 (11) 1–6] and the range directional model developed by Portela et al. [Portela, M.C.A.S., Thanassoulis, E., Simpson, G., 2004. A directional distance approach to deal with negative data in DEA: An application to bank branches. Journal of Operational Research Society 55 (10) 1111–1121]. A further example explores the advantages of using the new model.

A modified semi-oriented radial measure for target setting with negative data

Relevance:

40.00%

Abstract:

Over the last few years Data Envelopment Analysis (DEA) has been gaining increasing popularity as a tool for measuring efficiency and productivity of Decision Making Units (DMUs). Conventional DEA models assume non-negative inputs and outputs. However, in many real applications, some inputs and/or outputs can take negative values. Recently, Emrouznejad et al. [6] introduced a Semi-Oriented Radial Measure (SORM) for modelling DEA with negative data. This paper points out some issues in target setting with SORM models and introduces a modified SORM approach. An empirical study in the banking sector demonstrates the applicability of the proposed model. © 2014 Elsevier Ltd. All rights reserved.

Semi-supervised construction of general visualization hierarchies

Relevance:

30.00%

Abstract:

We have recently developed a principled approach to interactive non-linear hierarchical visualization [8] based on the Generative Topographic Mapping (GTM). Hierarchical plots are needed when a single visualization plot is not sufficient (e.g. when dealing with large quantities of data). In this paper we extend our system by giving the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode the user interactively selects "regions of interest" as in [8], whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of GTMs is used. The latter is particularly useful when the plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a data set of 2300 18-dimensional points and mention an extension of our system to accommodate discrete data types.

Lexical database enrichment through semi-automated morphological analysis

Relevance:

30.00%

Abstract:

Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither of them provides any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established.
The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.

Synchronization processes in nonlinear systems and their relation to cortical oscillatory dynamics

Relevance:

30.00%

Abstract:

This thesis was focused on theoretical models of synchronization to cortical dynamics as measured by magnetoencephalography (MEG). Dynamical systems theory was used in both identifying relevant variables for brain coordination and also in devising methods for their quantification. We presented a method for studying interactions of linear and chaotic neuronal sources using MEG beamforming techniques. We showed that such sources can be accurately reconstructed in terms of their location, temporal dynamics and possible interactions. Synchronization in low-dimensional nonlinear systems was studied to explore specific correlates of functional integration and segregation. In the case of interacting dissimilar systems, relevant coordination phenomena involved generalized and phase synchronization, which were often intermittent. Spatially-extended systems were then studied. For locally-coupled dissimilar systems, as in the case of cortical columns, clustering behaviour occurred. Synchronized clusters emerged at different frequencies and their boundaries were marked through oscillation death. The macroscopic mean field revealed sharp spectral peaks at the frequencies of the clusters and broader spectral drops at their boundaries. These results question existing models of Event Related Synchronization and Desynchronization. We re-examined the concept of the steady-state evoked response following an AM stimulus. We showed that very little variability in the AM following response could be accounted for by system noise. We presented a methodology for detecting local and global nonlinear interactions from MEG data in order to account for residual variability. We found cross-hemispheric nonlinear interactions of ongoing cortical rhythms concurrent with the stimulus and interactions of these rhythms with the following AM responses.
Finally, we hypothesized that holistic spatial stimuli would be accompanied by the emergence of clusters in primary visual cortex resulting in frequency-specific MEG oscillations. Indeed, we found different frequency distributions in induced gamma oscillations for different spatial stimuli, which was suggestive of temporal coding of these spatial stimuli. Further, we addressed the bursting character of these oscillations, which was suggestive of intermittent nonlinear dynamics. However, we did not observe the characteristic -3/2 power-law scaling in the distribution of interburst intervals. Further, this distribution was only seldom significantly different to the one obtained in surrogate data, where nonlinear structure was destroyed. In conclusion, the work presented in this thesis suggests that advances in dynamical systems theory in conjunction with developments in magnetoencephalography may facilitate a mapping between levels of description in the brain. This may potentially represent a major advancement in neuroscience.

Lithological mapping of Northwest Argentina with remote sensing data using tonal, textural and contextual features

Relevance:

30.00%

Abstract:

Tonal, textural and contextual properties are used in manual photointerpretation of remotely sensed data. This study has used these three attributes to produce a lithological map of semi-arid northwest Argentina by semi-automatic computer classification procedures of remotely sensed data. Three different types of satellite data were investigated: LANDSAT MSS, TM and SIR-A imagery. Supervised classification procedures using tonal features only produced poor classification results. LANDSAT MSS produced classification accuracies in the range of 40 to 60%, while accuracies of 50 to 70% were achieved using LANDSAT TM data. The addition of SIR-A data produced increases in the classification accuracy. The increased classification accuracy of TM over the MSS is because of the better discrimination of geological materials afforded by the middle infrared bands of the TM sensor. The maximum likelihood classifier consistently produced classification accuracies 10 to 15% higher than either the minimum distance to means or decision tree classifier; this improved accuracy was obtained at the cost of greatly increased processing time. A new type of classifier, the spectral shape classifier, which is computationally as fast as a minimum distance to means classifier, is described. However, the results for this classifier were disappointing, being lower in most cases than the minimum distance or decision tree procedures. The classification results using only tonal features were felt to be unacceptably poor; therefore textural attributes were investigated. Texture is an important attribute used by photogeologists to discriminate lithology. In the case of TM data, texture measures were found to increase the classification accuracy by up to 15%. However, in the case of the LANDSAT MSS data the use of texture measures did not provide any significant increase in the accuracy of classification.
For TM data, it was found that second order texture, especially the SGLDM based measures, produced the highest classification accuracy. Contextual post processing was found to increase classification accuracy and improve the visual appearance of classified output by removing isolated misclassified pixels which tend to clutter classified images. Simple contextual features, such as mode filters, were found to outperform more complex features such as the gravitational filter or minimal area replacement methods. Generally the larger the size of the filter, the greater the increase in the accuracy. Production rules were used to build a knowledge-based system which used tonal and textural features to identify sedimentary lithologies in each of the two test sites. The knowledge-based system was able to identify six out of ten lithologies correctly.
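
The abstract above credits mode filters with removing isolated misclassified pixels. As a rough illustration only, the following pure-Python sketch applies a mode (majority) filter to a small grid of class labels. The function name, the tie-handling rule and the sample grid are assumptions made for this example; they are not taken from the thesis.

```python
from collections import Counter


def mode_filter(labels, size=3):
    """Replace each label with the most common label in its size x size window.

    `labels` is a list of equal-length rows of hashable class labels.
    Windows are clipped at the image border (an assumption of this sketch).
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    rows, cols = len(labels), len(labels[0])
    half = size // 2
    out = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            window = [
                labels[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            counts = Counter(window)
            best = max(counts.values())
            # Keep the original label when it ties for the mode, so only
            # clear minorities are relabelled.
            if counts[labels[r][c]] < best:
                out[r][c] = min(k for k, v in counts.items() if v == best)
    return out


# Hypothetical classified image: class 3 appears as a single isolated pixel.
classified = [
    [1, 1, 1, 2],
    [1, 3, 1, 2],
    [1, 1, 2, 2],
]
smoothed = mode_filter(classified)
```

In this sample the isolated pixel of class 3 is absorbed into class 1, while the boundary between classes 1 and 2 is left unchanged.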

Modelling data and voice traffic over IP networks using continuous-time Markov models

Relevance:

30.00%

Abstract:

Common approaches to IP-traffic modelling have featured the use of stochastic models, based on the Markov property, which can be classified into black box and white box models based on the approach used for modelling traffic. White box models are simple to understand, transparent and have a physical meaning attributed to each of the associated parameters. To exploit this key advantage, this thesis explores the use of simple classic continuous-time Markov models based on a white box approach, to model, not only the network traffic statistics but also the source behaviour with respect to the network and application. The thesis is divided into two parts: The first part focuses on the use of simple Markov and Semi-Markov traffic models, starting from the simplest two-state model moving upwards to n-state models with Poisson and non-Poisson statistics. The thesis then introduces the convenient-to-use, mathematically derived, Gaussian Markov models which are used to model the measured network IP traffic statistics. As one of the most significant contributions, the thesis establishes the significance of the second-order density statistics as it reveals that, in contrast to first-order density, they carry much more unique information on traffic sources and behaviour. The thesis then exploits the use of Gaussian Markov models to model these unique features and finally shows how the use of simple classic Markov models coupled with use of second-order density statistics provides an excellent tool for capturing maximum traffic detail, which in itself is the essence of good traffic modelling. The second part of the thesis studies the ON-OFF characteristics of VoIP traffic with reference to accurate measurements of the ON and OFF periods, made from a large multi-lingual database of over 100 hours' worth of VoIP call recordings.
The impact of the language, prosodic structure and speech rate of the speaker on the statistics of the ON-OFF periods is analysed and relevant conclusions are presented. Finally, an ON-OFF VoIP source model with log-normal transitions is contributed as an ideal candidate to model VoIP traffic and the results of this model are compared with those of previously published work.
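
To make the idea of an ON-OFF source with log-normal transitions more concrete, here is a minimal Python sketch that alternates talk-spurt (ON) and silence (OFF) periods whose durations are drawn from log-normal distributions. The function name and all parameter values are hypothetical placeholders chosen for illustration; they are not the fitted values reported in the thesis.

```python
import random


def simulate_on_off(total_time, on_mu, on_sigma, off_mu, off_sigma, seed=0):
    """Return a list of (state, start_time, duration) tuples covering total_time.

    ON and OFF durations are log-normal: exp(N(mu, sigma^2)), in seconds.
    The source is assumed to start in the ON state.
    """
    rng = random.Random(seed)
    t = 0.0
    state = "ON"
    periods = []
    while t < total_time:
        mu, sigma = (on_mu, on_sigma) if state == "ON" else (off_mu, off_sigma)
        duration = rng.lognormvariate(mu, sigma)
        remaining = total_time - t
        if duration >= remaining:
            # Truncate the final period so the trace ends exactly at total_time.
            periods.append((state, t, remaining))
            break
        periods.append((state, t, duration))
        t += duration
        state = "OFF" if state == "ON" else "ON"
    return periods


# Hypothetical parameters for a 60-second call.
periods = simulate_on_off(60.0, on_mu=0.0, on_sigma=0.5,
                          off_mu=0.3, off_sigma=0.7, seed=1)
on_fraction = sum(d for s, _, d in periods if s == "ON") / 60.0
```

The fraction of time spent in the ON state (`on_fraction`) is one simple statistic that could be compared against measured recordings.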

Stand-alone groundwater desalination system using reverse osmosis combined with a cooled greenhouse for use in arid and semi-arid zones of India

Relevance:

30.00%

Abstract:

In many areas of northern India, salinity renders groundwater unsuitable for drinking and even for irrigation. Though membrane treatment can be used to remove the salt, there are some drawbacks to this approach, e.g. (1) depletion of the groundwater due to over-abstraction, (2) saline contamination of surface water and soil caused by concentrate disposal and (3) high electricity usage. To address these issues, a system is proposed in which a photovoltaic-powered reverse osmosis (RO) system is used to irrigate a greenhouse (GH) in a stand-alone arrangement. The concentrate from the RO is supplied to an evaporative cooling system, thus reducing the volume of the concentrate so that finally it can be evaporated in a pond to solid for safe disposal. Based on typical meteorological data for Delhi, calculations based on mass and energy balance are presented to assess the sizing and cost of the system. It is shown that solar radiation, freshwater output and evapotranspiration demand are readily matched due to the approximately linear relation among these variables. The demand for concentrate varies independently, however, thus favouring the use of a variable recovery arrangement. Though enough water may be harvested from the GH roof to provide year-round irrigation, this would require considerable storage. Some practical options for storage tanks are discussed. An alternative use of rainwater is in misting to reduce peak temperatures in the summer. An example optimised design provides internal temperatures below 30 °C (monthly average daily maxima) for 8 months of the year and costs about €36,000 for the whole system with GH floor area of 1000 m². Further work is needed to assess technical risks relating to scale-deposition in the membrane and evaporative pads, and to develop a business model that will allow such a project to succeed in the Indian rural context.

Quantitative neuropathology: data collection and statistical analysis

Relevance:

30.00%

Abstract:

The use of quantitative methods has become increasingly important in the study of neuropathology and especially in neurodegenerative disease. Disorders such as Alzheimer's disease (AD) and the frontotemporal dementias (FTD) are characterized by the formation of discrete, microscopic, pathological lesions which play an important role in pathological diagnosis. This chapter reviews the advantages and limitations of the different methods of quantifying pathological lesions in histological sections including estimates of density, frequency, coverage, and the use of semi-quantitative scores. The sampling strategies by which these quantitative measures can be obtained from histological sections, including plot or quadrat sampling, transect sampling, and point-quarter sampling, are described. In addition, data analysis methods commonly used to analyse quantitative data in neuropathology, including analysis of variance (ANOVA), polynomial curve fitting, multiple regression, classification trees, and principal components analysis (PCA), are discussed. These methods are illustrated with reference to quantitative studies of a variety of neurodegenerative disorders.

Towards a unified understanding of event-related changes in the EEG: the Firefly model of synchronization through cross-frequency phase modulation

Relevance:

30.00%

Abstract:

Although event-related potentials (ERPs) are widely used to study sensory, perceptual and cognitive processes, it remains unknown whether they are phase-locked signals superimposed upon the ongoing electroencephalogram (EEG) or result from phase-alignment of the EEG. Previous attempts to discriminate between these hypotheses have been unsuccessful but here a new test is presented based on the prediction that ERPs generated by phase-alignment will be associated with event-related changes in frequency whereas evoked-ERPs will not. Using empirical mode decomposition (EMD), which allows measurement of narrow-band changes in the EEG without predefining frequency bands, evidence was found for transient frequency slowing in recognition memory ERPs but not in simulated data derived from the evoked model. Furthermore, the timing of phase-alignment was frequency dependent with the earliest alignment occurring at high frequencies. Based on these findings, the Firefly model was developed, which proposes that both evoked and induced power changes derive from frequency-dependent phase-alignment of the ongoing EEG. Simulated data derived from the Firefly model provided a close match with empirical data and the model was able to account for i) the shape and timing of ERPs at different scalp sites, ii) the event-related desynchronization in alpha and synchronization in theta, and iii) changes in the power density spectrum from the pre-stimulus baseline to the post-stimulus period. The Firefly Model, therefore, provides not only a unifying account of event-related changes in the EEG but also a possible mechanism for cross-frequency information processing.

An alternating boundary integral based method for a Cauchy problem for the Laplace equation in semi-infinite regions

Relevance:

30.00%

Abstract:

We consider a Cauchy problem for the Laplace equation in a two-dimensional semi-infinite region with a bounded inclusion, i.e. the region is the intersection between a half-plane and the exterior of a bounded closed curve contained in the half-plane. The Cauchy data are given on the unbounded part of the boundary of the region and the aim is to construct the solution on the boundary of the inclusion. In 1989, Kozlov and Maz'ya [10] proposed an alternating iterative method for solving Cauchy problems for general strongly elliptic and formally self-adjoint systems in bounded domains. We extend their approach to our setting and in each iteration step mixed boundary value problems for the Laplace equation in the semi-infinite region are solved. Well-posedness of these mixed problems is investigated and convergence of the alternating procedure is examined. For the numerical implementation an efficient boundary integral equation method is proposed, based on the indirect variant of the boundary integral equation approach. The mixed problems are reduced to integral equations over the (bounded) boundary of the inclusion. Numerical examples are included showing the feasibility of the proposed method.

An iterative method based on boundary integrals for elliptic Cauchy problems in semi-infinite domains

Relevance:

30.00%

Abstract:

In this study, we investigate the problem of reconstruction of a stationary temperature field from given temperature and heat flux on a part of the boundary of a semi-infinite region containing an inclusion. This situation can be modelled as a Cauchy problem for the Laplace operator and it is an ill-posed problem in the sense of Hadamard. We propose and investigate a Landweber-Fridman type iterative method, which preserves the (stationary) heat operator, for the stable reconstruction of the temperature field on the boundary of the inclusion. In each iteration step, mixed boundary value problems for the Laplace operator are solved in the semi-infinite region. Well-posedness of these problems is investigated and convergence of the procedures is discussed. For the numerical implementation of these mixed problems an efficient boundary integral method is proposed which is based on the indirect variant of the boundary integral approach. Using this approach the mixed problems are reduced to integral equations over the (bounded) boundary of the inclusion. Numerical examples are included showing that stable and accurate reconstructions of the temperature field on the boundary of the inclusion can be obtained also in the case of noisy data. These results are compared with those obtained with the alternating iterative method.
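
For readers unfamiliar with Landweber-type iterations, the sketch below shows the classical Landweber update x_{k+1} = x_k + omega * A^T (b - A x_k) on a small finite-dimensional linear system. This is a deliberately simplified stand-in, not the method of the paper: the paper's iteration involves solving mixed boundary value problems in a semi-infinite region, whereas here the operator is just a hypothetical 2 x 2 matrix, and the names `landweber`, `A`, `b` and `omega` are assumptions of this example.

```python
def landweber(A, b, omega, iterations):
    """Classical Landweber iteration for A x = b, starting from x = 0.

    A is a list of rows, b a list of floats. The iteration converges for
    0 < omega < 2 / ||A||^2, where ||A|| is the largest singular value.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iterations):
        # Residual r = b - A x
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # Gradient step x <- x + omega * A^T r
        x = [x[j] + omega * sum(A[i][j] * residual[i] for i in range(m))
             for j in range(n)]
    return x


# Hypothetical test problem with exact solution (1, 3); ||A||^2 = 4, so
# omega = 0.2 satisfies the step-size condition.
A = [[2.0, 0.0], [0.0, 1.0]]
b = [2.0, 3.0]
x = landweber(A, b, omega=0.2, iterations=200)
```

With noisy data, such iterations are usually stopped early, so the iteration count plays the role of a regularization parameter.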
