918 results for Spatial analysis statistics -- Data processing


Relevance: 100.00%

Abstract:

Overlaying maps using a desktop GIS is often the first step of a multivariate spatial analysis. The potential of this operation has increased considerably as data sources and Web services to manipulate them are becoming widely available via the Internet. Standards from the OGC enable such geospatial ‘mashups’ to be seamless and user driven, involving discovery of thematic data. The user is naturally inclined to look for spatial clusters and ‘correlation’ of outcomes. Using classical cluster detection scan methods to identify multivariate associations can be problematic in this context, because of a lack of control over, or knowledge about, background populations. For public health and epidemiological mapping, this limiting factor can be critical, but often the focus is on spatial identification of risk factors associated with health or clinical status. In this article we point out that this association itself can ensure some control on underlying populations, and develop an exploratory scan statistic framework for multivariate associations. Inference using statistical map methodologies can be used to test the clustered associations. The approach is illustrated with a hypothetical data example and an epidemiological study on community MRSA. Scenarios of potential use for online mashups are introduced but full implementation is left for further research.

Relevance: 100.00%

Abstract:

This thesis describes the development of a complete data visualisation system for large tabular databases, such as those commonly found in a business environment. A state-of-the-art 'cyberspace cell' data visualisation technique was investigated and a powerful visualisation system using it was implemented. Although allowing databases to be explored and conclusions drawn, it had several drawbacks, the majority of which were due to the three-dimensional nature of the visualisation. A novel two-dimensional generic visualisation system, known as MADEN, was then developed and implemented, based upon a 2-D matrix of 'density plots'. MADEN allows an entire high-dimensional database to be visualised in one window, while permitting close analysis in 'enlargement' windows. Selections of records can be made and examined, and dependencies between fields can be investigated in detail. MADEN was used as a tool for investigating and assessing many data processing algorithms, firstly data-reducing (clustering) methods, then dimensionality-reducing techniques. These included a new 'directed' form of principal components analysis, several novel applications of artificial neural networks, and discriminant analysis techniques which illustrated how groups within a database can be separated. To illustrate the power of the system, MADEN was used to explore customer databases from two financial institutions, resulting in a number of discoveries which would be of interest to a marketing manager. Finally, the database of results from the 1992 UK Research Assessment Exercise was analysed. Using MADEN allowed both universities and disciplines to be graphically compared, and supplied some startling revelations, including empirical evidence of the 'Oxbridge factor'.

Relevance: 100.00%

Abstract:

This book is aimed primarily at microbiologists who are undertaking research and who require a basic knowledge of statistics to analyse their experimental data. Computer software employing a wide range of data analysis methods is widely available to experimental scientists. The availability of this software, however, makes it essential that investigators understand the basic principles of statistics. Statistical analysis of data can be complex with many different methods of approach, each of which applies in a particular experimental circumstance. Hence, it is possible to apply an incorrect statistical method to data and to draw the wrong conclusions from an experiment. The purpose of this book, which has its origin in a series of articles published in the Society for Applied Microbiology journal ‘The Microbiologist’, is to present the basic logic of statistics as clearly as possible and thereby to dispel some of the myths that often surround the subject. The 28 ‘Statnotes’ deal with various topics that are likely to be encountered, including the nature of variables, the comparison of means of two or more groups, non-parametric statistics, analysis of variance, correlating variables, and more complex methods such as multiple linear regression and principal components analysis. In each case, the relevant statistical method is illustrated with examples drawn from experiments in microbiological research. The text incorporates a glossary of the most commonly used statistical terms and there are two appendices designed to aid the investigator in the selection of the most appropriate test.

Relevance: 100.00%

Abstract:

Exploratory analysis of data seeks to find common patterns to gain insights into the structure and distribution of the data. In geochemistry it is a valuable means to gain insights into the complicated processes making up a petroleum system. Typically, linear visualisation methods such as principal components analysis, linked plots, or brushing are used. These methods cannot be employed directly when dealing with missing data, and they struggle to capture global non-linear structures in the data, although they can capture such structures locally. This thesis discusses a complementary approach based on a non-linear probabilistic model. The generative topographic mapping (GTM) enables the visualisation of the effects of very many variables on a single plot, which is able to incorporate more structure than a two-dimensional principal components plot. The model can deal with uncertainty and missing data, and allows for the exploration of the non-linear structure in the data. In this thesis a novel approach to initialise the GTM with arbitrary projections is developed. This makes it possible to combine GTM with algorithms like Isomap and fit complex non-linear structures like the Swiss-roll. Another novel extension is the incorporation of prior knowledge about the structure of the covariance matrix. This extension greatly enhances the modelling capabilities of the algorithm, resulting in a better fit to the data and better imputation capabilities for missing data. Additionally, an extensive benchmark study of the missing data imputation capabilities of GTM is performed. Further, a novel approach, based on missing data, is introduced to benchmark the fit of probabilistic visualisation algorithms on unlabelled data. Finally, the work is complemented by evaluating the algorithms on real-life datasets from geochemical projects.

Relevance: 100.00%

Abstract:

The use of quantitative methods has become increasingly important in the study of neuropathology and especially in neurodegenerative disease. Disorders such as Alzheimer's disease (AD) and the frontotemporal dementias (FTD) are characterized by the formation of discrete, microscopic, pathological lesions which play an important role in pathological diagnosis. This chapter reviews the advantages and limitations of the different methods of quantifying pathological lesions in histological sections including estimates of density, frequency, coverage, and the use of semi-quantitative scores. The sampling strategies by which these quantitative measures can be obtained from histological sections, including plot or quadrat sampling, transect sampling, and point-quarter sampling, are described. In addition, data analysis methods commonly used to analyse quantitative data in neuropathology, including analysis of variance (ANOVA), polynomial curve fitting, multiple regression, classification trees, and principal components analysis (PCA), are discussed. These methods are illustrated with reference to quantitative studies of a variety of neurodegenerative disorders.

Relevance: 100.00%

Abstract:

Remote sensing data is routinely used in ecology to investigate the relationship between landscape pattern, as characterised by land use and land cover maps, and ecological processes. Multiple factors related to the representation of geographic phenomena have been shown to affect the characterisation of landscape pattern, resulting in spatial uncertainty. This study statistically investigated the effect of the interaction between landscape spatial pattern and geospatial processing methods, unlike most papers, which consider the effect of each factor only in isolation. This is important since data used to calculate landscape metrics typically undergo a series of data abstraction processing tasks, and these tasks are rarely performed in isolation. The geospatial processing methods tested were the aggregation method and the choice of pixel size used to aggregate data. These were compared to two components of landscape pattern, spatial heterogeneity and the proportion of landcover class area. The interactions and their effect on the final landcover map were described using landscape metrics to measure landscape pattern and classification accuracy (response variables). All landscape metrics and classification accuracy were shown to be affected by both landscape pattern and by processing methods. Large variability in the response of those variables and interactions between the explanatory variables were observed. However, even though interactions occurred, they only affected the magnitude of the difference in landscape metric values. Thus, provided that the same processing methods are used, landscapes should retain their ranking when their landscape metrics are compared. For example, highly fragmented landscapes will always have larger values for the landscape metric "number of patches" than less fragmented landscapes. But the magnitude of difference between the landscapes may change and therefore absolute values of landscape metrics may need to be interpreted with caution.
The explanatory variables which had the largest effects were spatial heterogeneity and pixel size. These explanatory variables tended to result in large main effects and large interactions. The high variability in the response variables and the interaction of the explanatory variables indicate that it would be difficult to make generalisations about the impact of processing on landscape pattern, as only two processing methods were tested and untested processing methods are likely to result in even greater spatial uncertainty. © 2013 Elsevier B.V.
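The point about rankings and magnitudes can be illustrated with a small sketch. It assumes a majority-rule aggregation and a 4-connected definition of a patch; the abstract names neither, and the two toy landscapes and the function names are hypothetical.

```python
from collections import Counter


def aggregate_majority(grid, factor):
    """Coarsen a categorical raster: each factor x factor block becomes its most common class."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            out_row.append(Counter(block).most_common(1)[0][0])
        out.append(out_row)
    return out


def number_of_patches(grid, target):
    """Count 4-connected patches of class `target` with an iterative flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or (r, c) in seen:
                continue
            patches += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                i, j = stack.pop()
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if (0 <= ni < rows and 0 <= nj < cols
                            and grid[ni][nj] == target and (ni, nj) not in seen):
                        seen.add((ni, nj))
                        stack.append((ni, nj))
    return patches


# Hypothetical landscapes: 1 = habitat, 0 = matrix.
fragmented = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
]
compact = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

fine = (number_of_patches(fragmented, 1), number_of_patches(compact, 1))
coarse = (number_of_patches(aggregate_majority(fragmented, 2), 1),
          number_of_patches(aggregate_majority(compact, 2), 1))
print(fine)    # (4, 1)
print(coarse)  # (2, 1)
```

In this toy case the fragmented landscape keeps the larger "number of patches" value at both pixel sizes, but the gap shrinks from 3 to 1, which is the kind of magnitude change the abstract warns about.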

Relevance: 100.00%

Abstract:

The purpose of this research was to investigate the effects of Processing Instruction (VanPatten, 1996, 2007), as an input-based model for teaching second language grammar, on Syrian learners’ processing abilities. The present research investigated the effects of Processing Instruction on the acquisition of English relative clauses by Syrian learners in the form of a quasi-experimental design. Three separate groups were involved in the research (Processing Instruction, Traditional Instruction and a Control Group). For assessment, a pre-test, a direct post-test and a delayed post-test were used as the main tools for eliciting data. A questionnaire was also distributed to participants in the Processing Instruction group to give them the opportunity to give feedback on the treatment they received in comparison with the Traditional Instruction they are used to. Four hypotheses were formulated on the possible effectiveness of Processing Instruction on Syrian learners’ linguistic system. It was hypothesised that Processing Instruction would improve learners’ processing abilities, leading to an improvement in learners’ linguistic system. This was expected to lead to better performance in the comprehension and production of English relative clauses. The main source of data was analysed statistically using the ANOVA test. Cohen’s d calculations were also used to support the ANOVA test. Cohen’s d showed the magnitude of effects of the three treatments. Results of the analysis showed that both the Processing Instruction and Traditional Instruction groups had improved after treatment. However, the Processing Instruction Group significantly outperformed the other two groups in the comprehension of relative clauses. The analysis concluded that Processing Instruction is a useful tool for teaching relative clauses to Syrian learners.
This was supported by participants’ responses to the questionnaire, as they were in favour of Processing Instruction rather than Traditional Instruction. This research has theoretical and pedagogical implications. Theoretically, the study showed support for the Input hypothesis. That is, it was shown that Processing Instruction had a positive effect on input processing as it affected learners’ linguistic system. This was reflected in learners’ performance, where learners were able to produce a structure which they had not been asked to produce. Pedagogically, the present research showed that Processing Instruction is a useful tool for teaching English grammar in the context where the experiment was carried out, as it had a large effect on learners’ performance.
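As a rough illustration of how Cohen's d expresses the magnitude of a difference between two groups, the sketch below uses the common pooled-standard-deviation form. The abstract does not say which variant was used, and the scores are hypothetical, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (sample variances, n - 1 denominators)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical post-test comprehension scores for two treatment groups.
processing_instruction = [8, 9, 7, 10, 9]
traditional_instruction = [6, 7, 5, 8, 7]

d = cohens_d(processing_instruction, traditional_instruction)
print(round(d, 2))  # 1.75
```

By Cohen's conventional benchmarks, a value of about 0.8 or more is usually described as a large effect.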

Relevance: 100.00%

Abstract:

We re-analysed visuo-spatial perspective taking data from Kessler and Thomson (2010) plus a previously unpublished pilot with respect to individual- and sex differences in embodied processing (defined as body-posture congruence effects). We found that so-called 'systemisers' (males/low-social-skills) showed weaker embodiment than so-called 'embodiers' (females/high-social-skills). We conclude that 'systemisers' either have difficulties with embodied processing or, alternatively, they have a strategic advantage in selecting different mechanisms or the appropriate level of embodiment. In contrast, 'embodiers' have an advantageous strategy of "deep" embodied processing reflecting their urge to empathise or, alternatively, less flexibility in fine-tuning the involvement of bodily representations. © 2012 Copyright Taylor and Francis Group, LLC.

Relevance: 100.00%

Abstract:

Experimental data are very often the realization of a process that is fully determined by some unknown function but distorted by interference. Processing and analysis of experimental data are substantially easier if the data can be represented as an analytical expression. This article presents an algorithm for processing experimental data and an example of its use in the spectrographic analysis of oncological blood preparations.

Relevance: 100.00%

Abstract:

In the nonparametric framework of Data Envelopment Analysis (DEA), the statistical properties of the estimators have been investigated, but only asymptotic results are available. For DEA estimators, results of practical use have been proved only for the case of one input and one output. However, in real-world problems the production process is usually described by many variables. In this paper a machine learning approach to variable aggregation based on Canonical Correlation Analysis is presented. This approach is applied to efficiency estimation for all the farms on Terceira Island in the Azorean archipelago.
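The one-input, one-output case mentioned above is simple enough to sketch directly. Assuming constant returns to scale (an assumption, since the abstract does not state the returns-to-scale model), each unit's DEA efficiency reduces to its output/input ratio divided by the best ratio observed. The farm figures are hypothetical.

```python
def crs_efficiency(inputs, outputs):
    """Single-input, single-output DEA efficiency under constant returns to scale.

    Each unit is scored by its productivity (output / input) relative to the
    most productive unit, so the best unit scores 1.0.
    """
    productivity = [y / x for x, y in zip(inputs, outputs)]
    best = max(productivity)
    return [p / best for p in productivity]


# Hypothetical farms: one aggregated input and one aggregated output each.
farm_input = [10.0, 20.0, 40.0]
farm_output = [5.0, 16.0, 20.0]

scores = crs_efficiency(farm_input, farm_output)
print([round(s, 3) for s in scores])  # [0.625, 1.0, 0.625]
```

Aggregating many variables into one input and one output, as the paper proposes with Canonical Correlation Analysis, is what would make this simple case applicable; the aggregation step itself is not shown here.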

Relevance: 100.00%

Abstract:

Fluorescence spectroscopy has recently become more common in clinical medicine. However, there are still many unresolved issues related to the methodology and implementation of instruments with this technology. In this study, we aimed to assess individual variability of fluorescence parameters of endogenous markers (NADH, FAD, etc.) measured by fluorescence spectroscopy (FS) in situ and to analyse the factors that lead to a significant scatter of results. Most studied fluorophores have an acceptable scatter of values (mostly up to 30%) for diagnostic purposes. Here we provide evidence that the level of blood volume in tissue impacts FS data with a significant inverse correlation. The fluorescence intensity and the fluorescent contrast coefficient values follow a normal distribution for most of the studied fluorophores and the redox ratio. The effects of various physiological (different content of skin melanin) and technical (characteristics of optical filters) factors on the measurement results were additionally studied. The data on the variability of the measurement results in FS should be considered when interpreting the diagnostic parameters, as well as when developing new algorithms for data processing and FS devices.

Relevance: 100.00%

Abstract:

Current reform initiatives recommend that geometry instruction include the study of three-dimensional geometric objects and provide students with opportunities to use spatial skills in problem-solving tasks. Geometer's Sketchpad (GSP) is a dynamic and interactive computer program that enables the user to investigate and explore geometric concepts and manipulate geometric structures. Research using GSP as an instructional tool has focused primarily on teaching and learning two-dimensional geometry. This study explored the effect of a GSP based instructional environment on students' geometric thinking and three-dimensional spatial ability as they used GSP to learn three-dimensional geometry. For 10 weeks, 18 tenth-grade students from an urban school district used GSP to construct and analyze dynamic, two-dimensional representations of three-dimensional objects in a classroom environment that encouraged exploration, discussion, conjecture, and verification. The data were collected primarily from participant observations and clinical interviews and analyzed using qualitative methods of analysis. In addition, pretest and posttest measures of three-dimensional spatial ability and van Hiele level of geometric thinking were obtained. Spatial ability measures were analyzed using standard t-test analysis. The data from this study indicate that GSP is a viable tool to teach students about three-dimensional geometric objects. A comparison of students' pretest and posttest van Hiele levels showed an improvement in geometric thinking, especially for students on lower levels of the van Hiele theory. Evidence at the p < .05 level indicated that students' spatial ability improved significantly. Specifically, the GSP dynamic, visual environment supported students' visualization and reasoning processes as students attempted to solve challenging tasks about three-dimensional geometric objects.
The GSP instructional activities also provided students with an experiential base and an intuitive understanding about three-dimensional objects from which more formal work in geometry could be pursued. This study demonstrates that by designing appropriate GSP based instructional environments, it is possible to help students improve their spatial skills, develop more coherent and accurate intuitions about three-dimensional geometric objects, and progress through the levels of geometric thinking proposed by van Hiele.
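A pretest/posttest comparison of this kind is often run as a paired t-test on the per-student score differences. The sketch below assumes that paired form (the abstract says only "standard t-test analysis") and uses hypothetical scores for six students rather than the study's data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for a paired t-test on post - pre."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical spatial-ability scores for the same six students.
pretest = [10, 12, 9, 14, 11, 13]
posttest = [12, 15, 10, 15, 14, 14]

t_stat, dof = paired_t(pretest, posttest)
print(round(t_stat, 3), dof)  # 4.568 5
```

The t statistic would then be compared with the critical value for the given degrees of freedom to decide significance at the chosen level, such as the p < .05 level reported in the abstract.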

Relevance: 100.00%

Abstract:

This research presents several components encompassing the scope of the objective of Data Partitioning and Replication Management in Distributed GIS Database. Modern Geographic Information Systems (GIS) databases are often large and complicated. Therefore, data partitioning and replication management problems need to be addressed in the development of an efficient and scalable solution. Part of the research is to study the patterns of geographical raster data processing and to propose algorithms to improve the availability of such data. These algorithms and approaches target the granularity of geographic data objects as well as data partitioning in geographic databases to achieve high data availability and Quality of Service (QoS), considering distributed data delivery and processing. To achieve this goal, a dynamic, real-time approach for mosaicking digital images of different temporal and spatial characteristics into tiles is proposed. This dynamic approach reuses digital images upon demand and generates mosaicked tiles only for the required region according to the user's requirements, such as resolution, temporal range, and target bands, to reduce redundancy in storage and to utilize available computing and storage resources more efficiently. Another part of the research pursued methods for efficient acquisition of GIS data from external heterogeneous databases and Web services, as well as end-user GIS data delivery enhancements, automation and 3D virtual reality presentation. There are vast numbers of computing, network, and storage resources available on the Internet that are idle or not fully utilized. The proposed "Crawling Distributed Operating System" (CDOS) approach employs such resources and creates benefits for the hosts that lend their CPU, network, and storage resources to be used in the GIS database context. The results of this dissertation demonstrate effective ways to develop a highly scalable GIS database.
The approach developed in this dissertation has resulted in the creation of the TerraFly GIS database, which is used by the US government, researchers, and the general public to facilitate Web access to remotely-sensed imagery and GIS vector information.

Relevance: 100.00%

Abstract:

This dissertation established a software-hardware integrated design for a multisite data repository in pediatric epilepsy. A total of 16 institutions formed a consortium for this web-based application. This innovative fully operational web application allows users to upload and retrieve information through a unique human-computer graphical interface that is remotely accessible to all users of the consortium. A solution based on a Linux platform with MySQL and Personal Home Page scripts (PHP) has been selected. Research was conducted to evaluate mechanisms to electronically transfer diverse datasets from different hospitals and collect the clinical data in concert with their related functional magnetic resonance imaging (fMRI). What was unique in the approach considered is that all pertinent clinical information about patients is synthesized with input from clinical experts into 4 different forms, which were: Clinical, fMRI scoring, Image information, and Neuropsychological data entry forms. A first contribution of this dissertation was in proposing an integrated processing platform that was site and scanner independent in order to uniformly process the varied fMRI datasets and to generate comparative brain activation patterns. The data collection from the consortium complied with the IRB requirements and provides all the safeguards for security and confidentiality requirements. An 1-MR1-based software library was used to perform data processing and statistical analysis to obtain the brain activation maps. The Lateralization Index (LI) values of healthy control (HC) subjects were evaluated in contrast to those of localization-related epilepsy (LRE) subjects.
Over 110 activation maps were generated, and their respective LIs were computed, yielding the following groups: (a) strong right lateralization: (HC=0%, LRE=18%), (b) right lateralization: (HC=2%, LRE=10%), (c) bilateral: (HC=20%, LRE=15%), (d) left lateralization: (HC=42%, LRE=26%), (e) strong left lateralization: (HC=36%, LRE=31%). Moreover, nonlinear-multidimensional decision functions were used to seek an optimal separation between typical and atypical brain activations on the basis of the demographics as well as the extent and intensity of these brain activations. The intent was not to seek the highest output measures given the inherent overlap of the data, but rather to assess which of the many dimensions were critical in the overall assessment of typical and atypical language activations with the freedom to select any number of dimensions and impose any degree of complexity in the nonlinearity of the decision space.
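A lateralization index is commonly defined as LI = (L - R) / (L + R), where L and R measure activation in the left and right hemispheres. The sketch below assumes that definition with activated-voxel counts as the measure; the abstract gives neither the formula nor the category cutoffs, so the thresholds (0.2 and 0.5) and the function names are hypothetical.

```python
def lateralization_index(left, right):
    """LI = (L - R) / (L + R); +1 is fully left-lateralized, -1 fully right."""
    total = left + right
    if total == 0:
        raise ValueError("no activation in either hemisphere")
    return (left - right) / total


def classify(li, weak=0.2, strong=0.5):
    """Map an LI value to one of five groups (cutoffs are illustrative assumptions)."""
    if li >= strong:
        return "strong left lateralization"
    if li > weak:
        return "left lateralization"
    if li >= -weak:
        return "bilateral"
    if li > -strong:
        return "right lateralization"
    return "strong right lateralization"


# Hypothetical activated-voxel counts (left, right) for three activation maps.
for left, right in [(300, 100), (120, 100), (100, 300)]:
    li = lateralization_index(left, right)
    print(left, right, round(li, 3), classify(li))
```

Tallying the resulting labels separately for the HC and LRE subjects would give group percentages of the kind listed in (a) to (e).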

Relevance: 100.00%

Abstract:

The major objectives of this dissertation were to develop optimal spatial techniques to model the spatial-temporal changes of the lake sediments and their nutrients from 1988 to 2006, and to evaluate the impacts of the hurricanes that occurred during 1998–2006. The mud zone was reduced by about 10.5% from 1988 to 1998, and increased by about 6.2% from 1998 to 2006. Mud areas, volumes and weight were calculated using validated Kriging models. From 1988 to 1998, mud thicknesses increased by up to 26 cm in the central lake area. The mud area and volume decreased by about 13.78% and 10.26%, respectively. From 1998 to 2006, mud depths declined by up to 41 cm in the central lake area, and mud volume was reduced by about 27%. Mud weight increased by up to 29.32% from 1988 to 1998, but was reduced by over 20% from 1998 to 2006. The reduction of mud sediments is likely due to re-suspension and redistribution by waves and currents produced by large storm events, particularly Hurricanes Frances and Jeanne in 2004 and Wilma in 2005. Regression, kriging, geographically weighted regression (GWR) and regression-kriging models have been calibrated and validated for the spatial analysis of the sediment total phosphorus (TP) and total nitrogen (TN) of the lake. GWR models provide the most accurate predictions for TP and TN based on model performance and error analysis. TP values declined from an average of 651 to 593 mg/kg from 1998 to 2006, especially in the lake’s western and southern regions. From 1988 to 1998, TP declined in the northern and southern areas, and increased in the central-western part of the lake. The TP weights increased by about 37.99%–43.68% from 1988 to 1998 and decreased by about 29.72%–34.42% from 1998 to 2006. From 1988 to 1998, TN decreased in most areas, especially in the northern and southern lake regions; the western littoral zone had the biggest increase, up to 40,000 mg/kg. From 1998 to 2006, TN declined from an average of 9,363 to 8,926 mg/kg, especially in the central and southern regions.
The biggest increases occurred in the northern lake and southern edge areas. TN weights increased by about 15%–16.2% from 1988 to 1998, and decreased by about 7%–11% from 1998 to 2006.
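The average concentrations quoted for 1998 and 2006 can be turned into relative changes with simple arithmetic. The sketch below uses only the TP and TN averages stated in the abstract; the function name is an illustrative choice.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, in percent (negative means a decline)."""
    return (new - old) / old * 100.0


# Average sediment concentrations in mg/kg, 1998 -> 2006, as quoted in the abstract.
tp_change = percent_change(651, 593)
tn_change = percent_change(9363, 8926)

print(round(tp_change, 2))  # -8.91
print(round(tn_change, 2))  # -4.67
```

These are changes in mean concentration, which is a different quantity from the changes in TP and TN weights reported in the same abstract.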