35 resultados para text and data mining


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The proportion of population living in or around cites is more important than ever. Urban sprawl and car dependence have taken over the pedestrian-friendly compact city. Environmental problems like air pollution, land waste or noise, and health problems are the result of this still continuing process. The urban planners have to find solutions to these complex problems, and at the same time insure the economic performance of the city and its surroundings. At the same time, an increasing quantity of socio-economic and environmental data is acquired. In order to get a better understanding of the processes and phenomena taking place in the complex urban environment, these data should be analysed. Numerous methods for modelling and simulating such a system exist and are still under development and can be exploited by the urban geographers for improving our understanding of the urban metabolism. Modern and innovative visualisation techniques help in communicating the results of such models and simulations. This thesis covers several methods for analysis, modelling, simulation and visualisation of problems related to urban geography. The analysis of high dimensional socio-economic data using artificial neural network techniques, especially self-organising maps, is showed using two examples at different scales. The problem of spatiotemporal modelling and data representation is treated and some possible solutions are shown. The simulation of urban dynamics and more specifically the traffic due to commuting to work is illustrated using multi-agent micro-simulation techniques. A section on visualisation methods presents cartograms for transforming the geographic space into a feature space, and the distance circle map, a centre-based map representation particularly useful for urban agglomerations. Some issues on the importance of scale in urban analysis and clustering of urban phenomena are exposed. A new approach on how to define urban areas at different scales is developed, and the link with percolation theory established. Fractal statistics, especially the lacunarity measure, and scale laws are used for characterising urban clusters. In a last section, the population evolution is modelled using a model close to the well-established gravity model. The work covers quite a wide range of methods useful in urban geography. Methods should still be developed further and at the same time find their way into the daily work and decision process of urban planners. La part de personnes vivant dans une région urbaine est plus élevé que jamais et continue à croître. L'étalement urbain et la dépendance automobile ont supplanté la ville compacte adaptée aux piétons. La pollution de l'air, le gaspillage du sol, le bruit, et des problèmes de santé pour les habitants en sont la conséquence. Les urbanistes doivent trouver, ensemble avec toute la société, des solutions à ces problèmes complexes. En même temps, il faut assurer la performance économique de la ville et de sa région. Actuellement, une quantité grandissante de données socio-économiques et environnementales est récoltée. Pour mieux comprendre les processus et phénomènes du système complexe "ville", ces données doivent être traitées et analysées. Des nombreuses méthodes pour modéliser et simuler un tel système existent et sont continuellement en développement. Elles peuvent être exploitées par le géographe urbain pour améliorer sa connaissance du métabolisme urbain. Des techniques modernes et innovatrices de visualisation aident dans la communication des résultats de tels modèles et simulations. Cette thèse décrit plusieurs méthodes permettant d'analyser, de modéliser, de simuler et de visualiser des phénomènes urbains. L'analyse de données socio-économiques à très haute dimension à l'aide de réseaux de neurones artificiels, notamment des cartes auto-organisatrices, est montré à travers deux exemples aux échelles différentes. Le problème de modélisation spatio-temporelle et de représentation des données est discuté et quelques ébauches de solutions esquissées. La simulation de la dynamique urbaine, et plus spécifiquement du trafic automobile engendré par les pendulaires est illustrée à l'aide d'une simulation multi-agents. Une section sur les méthodes de visualisation montre des cartes en anamorphoses permettant de transformer l'espace géographique en espace fonctionnel. Un autre type de carte, les cartes circulaires, est présenté. Ce type de carte est particulièrement utile pour les agglomérations urbaines. Quelques questions liées à l'importance de l'échelle dans l'analyse urbaine sont également discutées. Une nouvelle approche pour définir des clusters urbains à des échelles différentes est développée, et le lien avec la théorie de la percolation est établi. Des statistiques fractales, notamment la lacunarité, sont utilisées pour caractériser ces clusters urbains. L'évolution de la population est modélisée à l'aide d'un modèle proche du modèle gravitaire bien connu. Le travail couvre une large panoplie de méthodes utiles en géographie urbaine. Toutefois, il est toujours nécessaire de développer plus loin ces méthodes et en même temps, elles doivent trouver leur chemin dans la vie quotidienne des urbanistes et planificateurs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The emergence of powerful new technologies, the existence of large quantities of data, and increasing demands for the extraction of added value from these technologies and data have created a number of significant challenges for those charged with both corporate and information technology management. The possibilities are great, the expectations high, and the risks significant. Organisations seeking to employ cloud technologies and exploit the value of the data to which they have access, be this in the form of "Big Data" available from different external sources or data held within the organisation, in structured or unstructured formats, need to understand the risks involved in such activities. Data owners have responsibilities towards the subjects of the data and must also, frequently, demonstrate that they are in compliance with current standards, laws and regulations. This thesis sets out to explore the nature of the technologies that organisations might utilise, identify the most pertinent constraints and risks, and propose a framework for the management of data from discovery to external hosting that will allow the most significant risks to be managed through the definition, implementation, and performance of appropriate internal control activities.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The paper presents the Multiple Kernel Learning (MKL) approach as a modelling and data exploratory tool and applies it to the problem of wind speed mapping. Support Vector Regression (SVR) is used to predict spatial variations of the mean wind speed from terrain features (slopes, terrain curvature, directional derivatives) generated at different spatial scales. Multiple Kernel Learning is applied to learn kernels for individual features and thematic feature subsets, both in the context of feature selection and optimal parameters determination. An empirical study on real-life data confirms the usefulness of MKL as a tool that enhances the interpretability of data-driven models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Advanced neuroinformatics tools are required for methods of connectome mapping, analysis, and visualization. The inherent multi-modality of connectome datasets poses new challenges for data organization, integration, and sharing. We have designed and implemented the Connectome Viewer Toolkit - a set of free and extensible open source neuroimaging tools written in Python. The key components of the toolkit are as follows: (1) The Connectome File Format is an XML-based container format to standardize multi-modal data integration and structured metadata annotation. (2) The Connectome File Format Library enables management and sharing of connectome files. (3) The Connectome Viewer is an integrated research and development environment for visualization and analysis of multi-modal connectome data. The Connectome Viewer's plugin architecture supports extensions with network analysis packages and an interactive scripting shell, to enable easy development and community contributions. Integration with tools from the scientific Python community allows the leveraging of numerous existing libraries for powerful connectome data mining, exploration, and comparison. We demonstrate the applicability of the Connectome Viewer Toolkit using Diffusion MRI datasets processed by the Connectome Mapper. The Connectome Viewer Toolkit is available from http://www.cmtk.org/

Relevância:

100.00% 100.00%

Publicador:

Resumo:

BACKGROUND: Selective publication of studies, which is commonly called publication bias, is widely recognized. Over the years a new nomenclature for other types of bias related to non-publication or distortion related to the dissemination of research findings has been developed. However, several of these different biases are often still summarized by the term 'publication bias'. METHODS/DESIGN: As part of the OPEN Project (To Overcome failure to Publish nEgative fiNdings) we will conduct a systematic review with the following objectives:- To systematically review highly cited articles that focus on non-publication of studies and to present the various definitions of biases related to the dissemination of research findings contained in the articles identified.- To develop and discuss a new framework on nomenclature of various aspects of distortion in the dissemination process that leads to public availability of research findings in an international group of experts in the context of the OPEN Project.We will systematically search Web of Knowledge for highly cited articles that provide a definition of biases related to the dissemination of research findings. A specifically designed data extraction form will be developed and pilot-tested. Working in teams of two, we will independently extract relevant information from each eligible article.For the development of a new framework we will construct an initial table listing different levels and different hazards en route to making research findings public. An international group of experts will iteratively review the table and reflect on its content until no new insights emerge and consensus has been reached. DISCUSSION: Results are expected to be publicly available in mid-2013. This systematic review together with the results of other systematic reviews of the OPEN project will serve as a basis for the development of future policies and guidelines regarding the assessment and prevention of publication bias.