848 results for Spatial data mining
Abstract:
This monograph presents the background, context and technical details of an Application Schema for the inclusion of cultural heritage spatial data into the framework defined by the European INSPIRE directive on geographic information. Nowadays, INSPIRE provides the most relevant framework for the dissemination and exchange of geographical data, covering many different thematic fields, and is particularly relevant for environmental datasets. Although cultural heritage elements are partially addressed within INSPIRE, there is no specific documentation on how these data should be considered, structured and published. This text aims to provide technical guidelines for decision makers, public administrations and the scientific community for the definition and implementation of harmonized datasets for cultural heritage, according to the interoperability principles of INSPIRE.
Abstract:
Clinicians could model a patient's brain injury through the patient's brain activity. However, how this model is defined and how it changes as the patient recovers are questions that remain unanswered. In this paper, the use of the MedVir framework is proposed with the aim of answering these questions. Based on complex data mining techniques, the framework not only differentiates between traumatic brain injury (TBI) patients and control subjects (with 72% accuracy using 0.632 bootstrap validation), but also detects whether a patient is likely to recover, all in a quick and easy way through an interactive visualization technique.
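The 0.632 bootstrap named above weights the optimistic training (resubstitution) error against the pessimistic out-of-bag error. The following is a minimal sketch of that estimator only. The one-dimensional nearest-centroid classifier, the synthetic data and all function names are assumptions made for illustration; the abstract does not describe MedVir's models or features.

```python
import random

def fit_centroids(xs, ys):
    """Toy classifier (an assumption for illustration): one mean per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def error_rate(centroids, xs, ys):
    """Fraction of samples that the model misclassifies."""
    wrong = sum(predict(centroids, x) != y for x, y in zip(xs, ys))
    return wrong / len(xs)

def bootstrap_632(xs, ys, n_rounds=200, seed=0):
    """Return 0.368 * training error + 0.632 * mean out-of-bag error."""
    rng = random.Random(seed)
    n = len(xs)
    train_err = error_rate(fit_centroids(xs, ys), xs, ys)
    oob_errors = []
    for _ in range(n_rounds):
        # Draw n indices with replacement; unseen indices form the test set.
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        if not oob:
            continue  # extremely unlikely: every sample was drawn
        model = fit_centroids([xs[i] for i in idx], [ys[i] for i in idx])
        oob_errors.append(
            error_rate(model, [xs[i] for i in oob], [ys[i] for i in oob])
        )
    oob_err = sum(oob_errors) / len(oob_errors)
    return 0.368 * train_err + 0.632 * oob_err

# Synthetic, well-separated two-class data (hypothetical, not patient data).
xs = [0.1 * i for i in range(20)] + [3.0 + 0.1 * i for i in range(20)]
ys = [0] * 20 + [1] * 20
estimated_error = bootstrap_632(xs, ys)
estimated_accuracy = 1.0 - estimated_error
```

Averaging the out-of-bag error per round, as done here, is a common simplification of the leave-one-out bootstrap error used in the original formulation.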
Abstract:
The mobile apps market is a tremendous success, with millions of apps downloaded and used every day by users all around the world. For app developers, having their apps published on one of the major app stores (e.g. Google Play market) is just the beginning of the app lifecycle. Indeed, in order to compete successfully with the other apps in the market, an app has to be updated frequently by adding attractive new features and by fixing existing bugs. Clearly, any developer interested in increasing the success of her app should try to implement features desired by the app’s users and to fix bugs affecting the user experience of many of them. A valuable source of information on users’ opinions and wishes is the reviews left by users on the store from which they downloaded the app. However, to exploit such information the app’s developer must manually read each user review and check whether it contains useful information (e.g. suggestions for new features). This is not feasible if the app receives hundreds of reviews per day, as happens for the most popular apps on the market. In this work, our aim is to support mobile app developers by proposing a novel approach that exploits data mining, natural language processing, machine learning, and clustering techniques to classify user reviews on the basis of the information they contain (e.g. useless, suggestion for new features, bug report). The approach has been empirically evaluated and made available in a web-based tool publicly available to all app developers. The results show that the developed tool: (i) correctly categorises user reviews on the basis of their content (e.g. isolating those reporting bugs) with 78% accuracy, (ii) produces clusters of reviews (e.g. groups together reviews indicating exactly the same bug to be fixed) that are meaningful from a developer’s point of view, and (iii) is considered useful by a software company working in the mobile app development market.
Abstract:
© The Author(s) 2014. Acknowledgements: We thank the Information Services Division, Scotland, who provided the SMR01 data, and NHS Grampian, who provided the biochemistry data. We also thank the University of Aberdeen’s Data Management Team. Funding: This work was supported by the Chief Scientists Office for Scotland (grant no. CZH/4/656).
Abstract:
The environmental, cultural and socio-economic causes and consequences of farmland abandonment are issues of increasing concern for researchers and policy makers. In previous studies, we proposed a new methodology for selecting the driving factors in farmland abandonment processes. Using Data Mining and GIS, it is possible to select the variables that are most significantly related to abandonment. The aim of this study is to investigate the application of this methodology to finding relationships between relief and farmland abandonment in a Mediterranean region (SE Spain). We have taken into account up to 28 different variables in a single analysis. Some of them are commonly considered in land use change studies (slope, altitude, TWI, etc.), while others are novel (sky view factor, terrain view factor, etc.). The variable selection process provides results in line with previous knowledge of the study area, describing some processes that are region-specific (e.g. abandonment versus intensification of agricultural activities). The European INSPIRE Directive (2007/2/EC) establishes that digital elevation models for land surfaces should be available in all member countries; this means that the research described in this work can be extrapolated to any European country to determine whether these variables (slope, altitude, etc.) are important in the process of abandonment.
Abstract:
This work analyses new trends in the creation and management of geographic information for building inductive models based exclusively on geographic databases. These models can integrate large volumes of data with heterogeneous characteristics, which entails great technical and methodological complexity. A methodology is proposed that makes it possible to determine in detail the distribution of natural water resources in a territory and to derive numerous information layers that can be incorporated into these data-hungry models. The study area chosen to apply this methodology is the Marina Baja district (Alicante), for which a spatial water balance calculation is presented using statistical and geostatistical tools and Geographic Information Systems. Finally, all the information layers generated (84) have been validated, and it has been verified that their creation allows a certain degree of automation, which will make it possible to incorporate them into broader Data Mining analyses.
Abstract:
The spatial data set delineates areas with similar environmental properties regarding soil, terrain morphology, climate and affiliation to the same administrative unit (NUTS3 or comparable units in size) at a minimum pixel size of 1 km². The purpose of developing this data set is to provide a link between spatial environmental information (e.g. soil properties) and statistical data (e.g. crop distribution) available at administrative level. Impact assessment of agricultural management on emissions of pollutants or radiatively active gases, or analysis of the influence of agricultural management on the supply of ecosystem services, requires the proper spatial coincidence of the driving factors. The HSU data set provides, for example, the link between the agro-economic model CAPRI and biophysical assessment of environmental impacts (updating previous spatial units, Leip et al. 2008), for the analysis of policy scenarios. Recently, a statistical model to disaggregate crop information available from regional statistics to the HSU has been developed (Lamboni et al. 2016). The HSU data set consists of the spatial layers provided in vector and raster format as well as attribute tables with information on the properties of the HSU. All input data for the delineation of the HSU is publicly available. For some parameters the attribute tables provide the link between the HSU data set and, for example, the soil map(s) rather than the data itself. The HSU data set is closely linked to the USCIE data set.
Abstract:
The Brazilian state of Paraná exhibits a violent geography of inequality and duality, hosting both the most developed city in the country, internationally recognized for its urban and environmental innovations, and southern Brazil’s most concentrated cluster of poverty and underdevelopment. Over the course of the past decades, the state underwent a major economic transformation, modernizing and expanding its industrial structure and shifting to the service sector with a larger participation of the knowledge economy. This study is concerned with the interplay between formal education and socioeconomic development during this process, and above all with its spatial character. It attempts to make sense of the rich literature on education and growth and/or development, discussing it through the lens of human geography and planning. To make the analysis possible, this study created a consistent database of municipal education scores over the course of 40 years, dealing with changing census methodologies and municipal boundaries. Making use of modern exploratory spatial data analysis combined with spatial regressions, the study identifies a clustered, time-persistent interplay between education and development that is stronger for low and basic levels of education. Moreover, it provides evidence not only that education is a predictor of future development, but also that analyses of this kind must take spatial autocorrelation into consideration in order to be accurate.
Abstract:
Remotely sensed data have been used extensively for environmental monitoring and modeling at a number of spatial scales; however, a limited range of satellite imaging systems often constrained the scales of these analyses. A wider variety of data sets is now available, allowing image data to be selected to match the scale of the environmental structure(s) or process(es) being examined. A framework is presented for use by environmental scientists and managers, enabling their spatial data collection needs to be linked to a suitable form of remotely sensed data. A six-step approach is used, combining image spatial analysis and scaling tools within the context of hierarchy theory. The main steps involved are: (1) identification of information requirements for the monitoring or management problem; (2) development of ideal image dimensions (scene model); (3) exploratory analysis of existing remotely sensed data using scaling techniques; (4) selection and evaluation of suitable remotely sensed data based on the scene model; (5) selection of suitable spatial analytic techniques to meet information requirements; and (6) cost-benefit analysis. Results from a case study show that the framework provided an objective mechanism to identify relevant aspects of the monitoring problem and environmental characteristics for selecting remotely sensed data and analysis techniques.
Abstract:
Online geographic information systems provide the means to extract a subset of desired spatial information from a larger remote repository. The retrieved data, which represent real-world geographic phenomena, are then manipulated to suit the specific needs of an end-user. Often this extraction requires the derivation of representations of objects specific to a particular resolution or scale from a single original stored version. Currently, standard spatial data handling techniques cannot support the multi-resolution representation of such features in a database. In this paper, a methodology is presented to store and retrieve versions of spatial objects at different resolutions with respect to scale using standard database primitives and SQL. The technique involves heavy fragmentation of spatial features, which allows dynamic simplification into scale-specific object representations customised to the display resolution of the end-user's device. Experimental results comparing the new approach to traditional R-Tree indexing and external object simplification reveal that the new approach performs notably better for mobile and WWW applications, where client-side resources are limited and retrieved data loads are kept relatively small.
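To illustrate what deriving a scale-specific representation from a single stored geometry can look like, the sketch below applies Douglas-Peucker line simplification with a tolerance that would be chosen per display resolution. This is a swapped-in, well-known technique used only as an example; it is not the fragmentation and SQL-based method of the paper, and the function names and sample coordinates are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Douglas-Peucker: drop vertices closer than `tolerance` to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest vertex and recurse on both halves.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right

# One stored polyline, three hypothetical display resolutions.
line = [(0.0, 0.0), (1.0, 0.05), (2.0, 1.0), (3.0, 0.05), (4.0, 0.0)]
detailed = simplify(line, 0.1)
medium = simplify(line, 0.5)
coarse = simplify(line, 2.0)
```

A larger tolerance corresponds to a smaller display scale: fewer vertices are sent to the client, which is the effect the abstract attributes to its own database-side technique.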
Abstract:
This paper develops an Internet geographical information system (GIS) and spatial model application that provides socio-economic information and exploratory spatial data analysis for local government authorities (LGAs) in Queensland, Australia. The application aims to improve the means by which large quantities of data may be analysed, manipulated and displayed in order to highlight trends and patterns as well as provide performance benchmarking that is readily understandable and easily accessible for decision-makers. Measures of attribute similarity and spatial proximity are combined in a clustering model with a spatial autocorrelation index for exploratory spatial data analysis to support the identification of spatial patterns of change. Analysis of socio-economic changes in Queensland is presented. The results demonstrate the usefulness and potential appeal of Internet GIS applications as tools to inform the process of regional analysis, planning and policy.
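The abstract does not say which spatial autocorrelation index is used. As an illustration, the sketch below computes global Moran's I, a widely used index, assuming a binary adjacency weight matrix. The helper names and the four-region chain are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between neighbouring regions.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)

def chain_weights(n):
    """Binary adjacency for n regions arranged in a line (an assumption)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w

w = chain_weights(4)
clustered = morans_i([0.0, 0.0, 1.0, 1.0], w)    # similar values are adjacent
alternating = morans_i([0.0, 1.0, 0.0, 1.0], w)  # neighbours always differ
```

Positive values indicate that similar attribute values cluster in space, and negative values indicate that neighbours tend to differ. Here `clustered` is positive and `alternating` is negative.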
Abstract:
We combine spatial data on home ranges of individuals and microsatellite markers to examine patterns of fine-scale spatial genetic structure and dispersal within a brush-tailed rock-wallaby (Petrogale penicillata) colony at Hurdle Creek Valley, Queensland. Brush-tailed rock-wallabies were once abundant and widespread throughout the rocky terrain of southeastern Australia; however, populations are nearly extinct in the south of their range and in decline elsewhere. We use pairwise relatedness measures and a recent multilocus spatial autocorrelation analysis to test the hypotheses that in this species, within-colony dispersal is male-biased and that female philopatry results in spatial clusters of related females within the colony. We provide clear evidence for strong female philopatry and male-biased dispersal within this rock-wallaby colony. There was a strong, significant negative correlation between pairwise relatedness and geographical distance of individual females along only 800 m of cliff line. Spatial genetic autocorrelation analyses showed significant positive correlation for females in close proximity to each other and revealed a genetic neighbourhood size of only 600 m for females. Our study is the first to report on the fine-scale spatial genetic structure within a rock-wallaby colony and we provide the first robust evidence for strong female philopatry and spatial clustering of related females within this taxon. We discuss the ecological and conservation implications of our findings for rock-wallabies, as well as the importance of fine-scale spatial genetic patterns in studies of dispersal behaviour.
Abstract:
We have performed a systematic temporal and spatial expression profiling of the developing mouse kidney using Compugen long-oligonucleotide microarrays. The activity of 18,000 genes was monitored at 24-h intervals from 10.5-day-postcoitum (dpc) metanephric mesenchyme (MM) through to neonatal kidney, and a cohort of 3,600 dynamically expressed genes was identified. Early metanephric development was further surveyed by directly comparing RNA from 10.5 vs. 11.5 vs. 13.5dpc kidneys. These data showed high concordance with the previously published dynamic profile of rat kidney development (Stuart RO, Bush KT, and Nigam SK. Proc Natl Acad Sci USA 98: 5649-5654, 2001) and our own temporal data. Cluster analyses were used to identify gene ontological terms, functional annotations, and pathways associated with temporal expression profiles. Genetic network analysis was also used to identify biological networks that have maximal transcriptional activity during early metanephric development, highlighting the involvement of proliferation and differentiation. Differential gene expression was validated using whole mount and section in situ hybridization of staged embryonic kidneys. Two spatial profiling experiments were also undertaken. MM (10.5dpc) was compared with adjacent intermediate mesenchyme to further define metanephric commitment. To define the genes involved in branching and in the induction of nephrogenesis, expression profiling was performed on ureteric bud (GFP+) FACS sorted from HoxB7-GFP transgenic mice at 15.5dpc vs. the GFP- mesenchymal derivatives. Comparisons between temporal and spatial data enhanced the ability to predict function for genes and networks. This study provides the most comprehensive temporal and spatial survey of kidney development to date, and the compilation of these transcriptional surveys provides important insights into metanephric development that can now be functionally tested.
Abstract:
Quantile computation has many applications including data mining and financial data analysis. It has been shown that an ε-approximate summary can be maintained so that, given a quantile query (φ, ε), the data item at rank ⌈φN⌉ may be approximately obtained within the rank error precision εN over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different φ and ε poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, first we aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users can be retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times for reprocessing clusters and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
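One way to picture the grouping idea: a query (φ, ε) over N items accepts any item whose rank lies within εN of φN, so queries whose acceptable rank intervals share a common rank can be served by one answer. The sketch below groups a static set of queries with the classic greedy interval-stabbing rule, which yields the minimum number of groups for fixed intervals. It is an illustration under these assumptions only, not the paper's algorithms; it ignores stream updates and reprocessing costs, and the function name and sample queries are hypothetical.

```python
def cluster_queries(queries, n_items):
    """Group (phi, eps) queries whose acceptable rank intervals share a rank.

    Each query accepts ranks in [(phi - eps) * N, (phi + eps) * N].
    Greedy rule: sort by right end point and open a new cluster whenever
    the next interval starts after the current shared rank.
    """
    intervals = sorted(
        (
            (
                max(0.0, (phi - eps) * n_items),
                min(float(n_items), (phi + eps) * n_items),
                (phi, eps),
            )
            for phi, eps in queries
        ),
        key=lambda item: item[1],
    )
    clusters = []
    shared_rank = None
    for low, high, query in intervals:
        if shared_rank is None or low > shared_rank:
            # No existing shared rank satisfies this query: start a cluster.
            shared_rank = high
            clusters.append({"rank": shared_rank, "queries": [query]})
        else:
            # The current shared rank lies inside this query's interval.
            clusters[-1]["queries"].append(query)
    return clusters

# Three hypothetical queries over N = 1000 items.
queries = [(0.5, 0.1), (0.55, 0.1), (0.9, 0.01)]
clusters = cluster_queries(queries, 1000)
```

In this example the first two queries fall into one cluster because a single rank satisfies both precision requirements, while the tight third query needs its own cluster.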
Abstract:
Spatial data is now used extensively in the Web environment, providing online customized maps and supporting map-based applications. The full potential of Web-based spatial applications, however, has yet to be achieved due to performance issues related to the large size and high complexity of spatial data. In this paper, we introduce a multiresolution approach to spatial data management and query processing such that the database server can choose spatial data at the right resolution level for different Web applications. One highly desirable property of the proposed approach is that the server-side processing cost and network traffic can be reduced when the level of resolution required by applications is low. Another advantage is that our approach pushes complex multiresolution structures and algorithms into the spatial database engine. That is, the developer of spatial Web applications need not be concerned with such complexity. This paper explains the basic idea, technical feasibility and applications of multiresolution spatial databases.