13 results for multivariate binary data

in Aston University Research Archive


Relevance:

90.00%

Publisher:

Abstract:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are second-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.
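
As a rough illustration of the idea in this abstract, the sketch below computes a normalized Shannon entropy over second-order co-occurrences of marks. It is not the paper's measure: the function name `cooccurrence_entropy`, the fixed-radius neighbourhood rule and the normalization by the number of possible unordered mark pairs are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_entropy(points, radius):
    """Normalized Shannon entropy of mark pairs co-occurring within `radius`.

    `points` is a list of (x, y, mark) tuples. The neighbourhood rule and
    the normalization are illustrative assumptions, not the paper's method.
    """
    counts = Counter()
    # Count each unordered pair of marks for every pair of nearby points.
    for (x1, y1, m1), (x2, y2, m2) in combinations(points, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            counts[tuple(sorted((m1, m2)))] += 1

    total = sum(counts.values())
    if total == 0:
        return 0.0  # no co-occurrences at this scale

    # Empirical multinomial distribution over co-occurrence types.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())

    # Normalize by the maximum entropy over all possible unordered mark pairs.
    k = len({mark for _, _, mark in points})
    n_types = k * (k + 1) // 2
    return entropy / math.log(n_types) if n_types > 1 else 0.0
```

Values near 1 would indicate that all mark pairings co-occur about equally often at the chosen scale, while values near 0 would indicate that a few pairings dominate.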

Relevance:

80.00%

Publisher:

Abstract:

A visualization plot of a data set of molecular data is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce a visualization is principal component analysis (PCA). This paper takes PCA, together with four other statistical models (NeuroScale, GTM, LTM, and LTM-LIN), and evaluates their ability to produce clustering in visualizations not of molecular descriptors but of molecular fingerprints. Two different tasks are addressed: understanding structural information (particularly combinatorial libraries) and relating structure to activity. The quality of the visualizations is compared both subjectively (by visual inspection) and objectively (with global distance comparisons and local k-nearest-neighbor predictors). On the data sets used to evaluate clustering by structure, LTM is found to perform significantly better than the other models. In particular, the clusters in LTM visualization space are consistent with the relationships between the core scaffolds that define the combinatorial sublibraries. On the data sets used to evaluate clustering by activity, LTM again gives the best performance but by a smaller margin. The results of this paper demonstrate the value of using both a nonlinear projection map and a Bernoulli noise model for modeling binary data.

Relevance:

80.00%

Publisher:

Abstract:

Visual mental imagery is a complex process that may be influenced by the content of mental images. Neuropsychological evidence from patients with hemineglect suggests that in the imagery domain environments and objects may be represented separately and may be selectively affected by brain lesions. In the present study, we used functional magnetic resonance imaging (fMRI) to assess the possibility of neural segregation among mental images depicting parts of an object, of an environment (imagined from a first-person perspective), and of a geographical map, using both a mass univariate and a multivariate approach. Data show that different brain areas are involved in different types of mental images. Imagining an environment relies mainly on regions known to be involved in navigational skills, such as the retrosplenial complex and parahippocampal gyrus, whereas imagining a geographical map mainly requires activation of the left angular gyrus, known to be involved in the representation of categorical relations. Imagining a familiar object mainly requires activation of parietal areas involved in visual space analysis in both the imagery and the perceptual domain. We also found that the pattern of activity in most of these areas specifically codes for the spatial arrangement of the parts of the mental image. Our results clearly demonstrate a functional neural segregation for different contents of mental images and suggest that visuospatial information is coded by different patterns of activity in brain areas involved in visual mental imagery. Hum Brain Mapp 36:945-958, 2015.

Relevance:

80.00%

Publisher:

Abstract:

A multistage distillation column in which mass transfer and a reversible chemical reaction occurred simultaneously has been investigated to formulate a technique by which this process can be analysed or predicted. A transesterification reaction between ethyl alcohol and butyl acetate, catalysed by concentrated sulphuric acid, was selected for the investigation and all the components were analysed on a gas-liquid chromatograph. The transesterification reaction kinetics were studied in a batch reactor for catalyst concentrations of 0.1 - 1.0 weight percent and temperatures between 21.4 and 85.0 °C. The reaction was found to be second order and dependent on the catalyst concentration at a given temperature. The vapour-liquid equilibrium data for six binary, four ternary and one quaternary systems were measured at atmospheric pressure using a modified Cathala dynamic equilibrium still. The systems, with the exception of ethyl alcohol - butyl alcohol mixtures, were found to be non-ideal. Multicomponent vapour-liquid equilibrium compositions were predicted by a computer programme which utilised the Van Laar constants obtained from the binary data sets. Good agreement was obtained between the predicted and experimental quaternary equilibrium vapour compositions. Continuous transesterification experiments were carried out in a six-stage sieve plate distillation column. The column was 3" in internal diameter and of unit construction in glass. The plates were 8" apart and had a free area of 7.7%. Both the liquid and vapour streams were analysed. The component conversion was dependent on the boilup rate and the reflux ratio. Because of the presence of the reaction, the concentration of one of the lighter components increased below the feed plate. In the same region a highly developed foam was formed due to the presence of the catalyst. The experimental results were analysed by the solution of a series of simultaneous enthalpy and mass equations. Good agreement was obtained between the experimental and calculated results.
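
To make the role of the Van Laar constants concrete, here is a minimal sketch of the standard binary Van Laar equations combined with modified Raoult's law. It is not the thesis's programme: it covers only a binary pair, it takes the saturation pressures at a fixed temperature as given rather than solving the isobaric bubble point, and the names `van_laar_gammas` and `bubble_point_vapour` and all numerical inputs are hypothetical.

```python
import math


def van_laar_gammas(x1, a12, a21):
    """Activity coefficients (gamma1, gamma2) of a binary liquid mixture.

    Uses ln(gamma1) = A12 * (A21*x2 / (A12*x1 + A21*x2))**2 and the
    symmetric expression for gamma2. Assumes A12 and A21 are nonzero and
    of the same sign, so the denominator never vanishes.
    """
    x2 = 1.0 - x1
    denom = a12 * x1 + a21 * x2
    ln_g1 = a12 * (a21 * x2 / denom) ** 2
    ln_g2 = a21 * (a12 * x1 / denom) ** 2
    return math.exp(ln_g1), math.exp(ln_g2)


def bubble_point_vapour(x1, a12, a21, p1_sat, p2_sat):
    """Vapour mole fraction y1 and total pressure via modified Raoult's law.

    p1_sat and p2_sat are pure-component saturation pressures at the
    temperature of interest (hypothetical inputs, any consistent unit).
    """
    g1, g2 = van_laar_gammas(x1, a12, a21)
    partial1 = x1 * g1 * p1_sat
    partial2 = (1.0 - x1) * g2 * p2_sat
    total = partial1 + partial2
    return partial1 / total, total
```

At infinite dilution (x1 = 0) the first coefficient reduces to exp(A12), which is how the constants are usually related to binary data.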

Relevance:

80.00%

Publisher:

Abstract:

Descriptions of vegetation communities are often based on vague semantic terms describing species presence and dominance. For this reason, some researchers advocate the use of fuzzy sets in the statistical classification of plant species data into communities. In this study, spatially referenced vegetation abundance values collected from Greek phrygana were analysed by ordination (DECORANA), and classified on the resulting axes using fuzzy c-means to yield a point data-set representing local memberships in characteristic plant communities. The fuzzy clusters matched vegetation communities noted in the field, which tended to grade into one another, rather than occupying discrete patches. The fuzzy set representation of the community exploited the strengths of detrended correspondence analysis while retaining richer information than a TWINSPAN classification of the same data. Thus, in the absence of phytosociological benchmarks, meaningful and manageable habitat information could be derived from complex, multivariate species data. We also analysed the influence of the reliability of different surveyors' field observations by multiple sampling at a selected sample location. We show that the impact of surveyor error was more severe in the Boolean than the fuzzy classification. © 2007 Springer.

Relevance:

30.00%

Publisher:

Abstract:

Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models (HMMs) to identify the lag (or delay) between different variables for such data. We first present a method using maximum likelihood estimation and propose a simple algorithm which is capable of identifying associations between variables. We also adopt an information-theoretic approach and develop a novel procedure for training HMMs to maximise the mutual information between delayed time series. Both methods are successfully applied to real data. We model the oil drilling process with HMMs and estimate a crucial parameter, namely the lag for return.
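
For context, the cross-correlation baseline that this abstract contrasts with its HMM methods can be sketched as follows. This is the traditional approach, not the paper's algorithm, and the helper names `pearson` and `best_lag` are hypothetical. It picks the non-negative delay at which the second series correlates most strongly with the first.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_lag(x, y, max_lag):
    """Return the lag in 0..max_lag maximizing corr(x[t], y[t + lag])."""
    best, best_corr = 0, -math.inf
    for lag in range(max_lag + 1):
        # Align x[t] with y[t + lag] over the overlapping window.
        corr = pearson(x[: len(x) - lag], y[lag:])
        if corr > best_corr:
            best, best_corr = lag, corr
    return best
```

Because it relies on a single linear correlation over the whole record, this baseline is the kind of estimate the abstract describes as insufficient for non-linear, non-stationary data.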

Relevance:

30.00%

Publisher:

Abstract:

When applying multivariate analysis techniques in information systems and social science disciplines, such as management information systems (MIS) and marketing, the assumption that the empirical data originate from a single homogeneous population is often unrealistic. When applying a causal modeling approach, such as partial least squares (PLS) path modeling, segmentation is a key issue in coping with the problem of heterogeneity in estimated cause-and-effect relationships. This chapter presents a new PLS path modeling approach which classifies units on the basis of the heterogeneity of the estimates in the inner model. If unobserved heterogeneity significantly affects the estimated path model relationships on the aggregate data level, the methodology will allow homogeneous groups of observations to be created that exhibit distinctive path model estimates. The approach will, thus, provide differentiated analytical outcomes that permit more precise interpretations of each segment formed. An application on a large data set in an example of the American customer satisfaction index (ACSI) substantiates the methodology’s effectiveness in evaluating PLS path modeling results.

Relevance:

30.00%

Publisher:

Abstract:

Overlaying maps using a desktop GIS is often the first step of a multivariate spatial analysis. The potential of this operation has increased considerably as data sources and Web services to manipulate them are becoming widely available via the Internet. Standards from the OGC enable such geospatial ‘mashups’ to be seamless and user driven, involving discovery of thematic data. The user is naturally inclined to look for spatial clusters and ‘correlation’ of outcomes. Using classical cluster detection scan methods to identify multivariate associations can be problematic in this context, because of a lack of control on or knowledge about background populations. For public health and epidemiological mapping, this limiting factor can be critical, but often the focus is on spatial identification of risk factors associated with health or clinical status. In this article we point out that this association itself can ensure some control on underlying populations, and develop an exploratory scan statistic framework for multivariate associations. Inference using statistical map methodologies can be used to test the clustered associations. The approach is illustrated with a hypothetical data example and an epidemiological study on community MRSA. Scenarios of potential use for online mashups are introduced but full implementation is left for further research.

Relevance:

30.00%

Publisher:

Abstract:

Strategic planning, and more specifically the impact of strategic planning on organisational performance, has been the subject of significant academic interest since the early 1970s. However, despite the significant amount of previous work examining the relationship between strategic planning and organisational performance, a comprehensive literature review identified a number of areas where contributions to the domain of study could be made. In overview, the main areas for further study identified from the literature review were a) a further examination of both the dimensionality and conceptualisation of strategic planning and organisational performance and b) a further, multivariate, examination of the relationship between strategic planning and performance, to capture the newly identified dimensionality. In addition to the previously identified strategic planning and organisational performance constructs, a comprehensive literature-based assessment was undertaken and five main areas were identified for further examination. These were a) organisational b) comprehensive strategic choice, c) the quality of strategic options generated, d) political behavior and e) implementation success. From this, a conceptual model incorporating a set of hypotheses to be tested was formulated. In order to test the conceptual model specified and also the stated hypotheses, data gathering was undertaken. The quantitative phase of the research involved a mail survey of senior managers in medium to large UK-based organisations, of which a total of 366 fully useable responses were received. Following rigorous individual construct validity and reliability testing, the complete conceptual model was tested using latent variable path analysis. The results for the individual hypotheses and also the complete conceptual model were most encouraging. The findings, theoretical and managerial implications, limitations and directions for future research are discussed.

Relevance:

30.00%

Publisher:

Abstract:

We discuss aggregation of data from neuropsychological patients and the process of evaluating models using data from a series of patients. We argue that aggregation can be misleading but not aggregating can also result in information loss. The basis for combining data needs to be theoretically defined, and the particular method of aggregation depends on the theoretical question and characteristics of the data. We present examples, often drawn from our own research, to illustrate these points. We also argue that statistical models and formal methods of model selection are a useful way to test theoretical accounts using data from several patients in multiple-case studies or case series. Statistical models can often measure fit in a way that explicitly captures what a theory allows; the parameter values that result from model fitting often measure theoretically important dimensions and can lead to more constrained theories or new predictions; and model selection allows the strength of evidence for models to be quantified without forcing this into the artificial binary choice that characterizes hypothesis testing methods. Methods that aggregate and then formally model patient data, however, are not automatically preferred to other methods. Which method is preferred depends on the question to be addressed, characteristics of the data, and practical issues like availability of suitable patients, but case series, multiple-case studies, single-case studies, statistical models, and process models should be complementary methods when guided by theory development.

Relevance:

30.00%

Publisher:

Abstract:

This paper presents the results of a multivariate spatial analysis of 38 vowel formant variables in the language of 402 informants from 236 cities from across the contiguous United States, based on the acoustic data from the Atlas of North American English (Labov, Ash & Boberg, 2006). The results of the analysis both confirm and challenge the results of the Atlas. Most notably, while the analysis identifies patterns similar to those of the Atlas in the West and the Southeast, the analysis finds that the Midwest and the Northeast are distinct dialect regions that are considerably stronger than the traditional Midland and Northern dialect regions identified in the Atlas. The analysis also finds evidence that a western vowel shift is actively shaping the language of the Western United States.

Relevance:

30.00%

Publisher:

Abstract:

We examine the efficiency of multivariate macroeconomic forecasts by estimating a vector autoregressive model on the forecast revisions of four variables (GDP, inflation, unemployment and wages). Using a data set of professional forecasts for the G7 countries, we find evidence of cross-series revision dynamics. Specifically, forecast revisions are conditionally correlated with the lagged forecast revisions of other macroeconomic variables, and the sign of the correlation is as predicted by conventional economic theory. This indicates that forecasters are slow to incorporate news across variables. We show that this finding can be explained by forecast underreaction.