976 results for Data Modeling


Relevance:

70.00%

Publisher:

Abstract:

Perfect information is seldom available to humans or machines due to uncertainties inherent in real-world problems. Uncertainties in geographic information systems (GIS) stem from either vague/ambiguous or imprecise/inaccurate/incomplete information, and GIS therefore needs tools and techniques to manage these uncertainties. There is widespread agreement in the GIS community that although GIS has the potential to support a wide range of spatial data analysis problems, this potential is often hindered by a lack of consistency and uniformity. Uncertainties come in many shapes and forms, and processing uncertain spatial data requires a practical taxonomy to aid decision makers in choosing the most suitable data modeling and analysis method. In this paper, we: (1) review important developments in handling uncertainties when working with spatial data and GIS applications; (2) propose a taxonomy of models for dealing with uncertainties in GIS; and (3) identify current challenges and future research directions in spatial data analysis and GIS for managing uncertainties.
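
The vague/ambiguous class of uncertainty mentioned above is commonly handled with fuzzy set theory, where membership in a spatial class is graded rather than crisp. The sketch below is a minimal illustration of that idea, not an example from the paper; the class "near a river" and the 100 m / 500 m breakpoints are hypothetical.

```python
# Minimal sketch: fuzzy membership for a vague spatial class such as "near a river".
# The 100 m / 500 m breakpoints are hypothetical, chosen only for illustration.

def near_river_membership(distance_m: float) -> float:
    """Graded membership in the vague class 'near a river' (1 = fully near, 0 = not near)."""
    if distance_m <= 100.0:      # unambiguously near
        return 1.0
    if distance_m >= 500.0:      # unambiguously not near
        return 0.0
    # a linear transition models the vagueness of the boundary
    return (500.0 - distance_m) / (500.0 - 100.0)

if __name__ == "__main__":
    for d in (50, 200, 400, 600):
        print(d, round(near_river_membership(d), 2))
```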

Relevance:

70.00%

Publisher:

Abstract:

There is remarkable agreement in expectations today for vastly improved ocean data management a decade from now -- capabilities that will help to bring significant benefits to ocean research and to society. Advancing data management to such a degree, however, will require cultural and policy changes that are slow to effect. The technological foundations upon which data management systems are built are certain to continue advancing rapidly in parallel. These considerations argue for adopting attitudes of pragmatism and realism when planning data management strategies. In this paper we adopt those attitudes as we outline opportunities for progress in ocean data management. We begin with a synopsis of expectations for integrated ocean data management a decade from now. We discuss factors that should be considered by those evaluating candidate “standards”. We highlight challenges and opportunities in a number of technical areas, including “Web 2.0” applications, data modeling, data discovery and metadata, real-time operational data, data archival, biological data management, and satellite data management. We discuss the importance of investments in the development of software toolkits to accelerate progress. We conclude by recommending a few specific, short-term targets for implementation that we believe to be both significant and achievable, and by calling for action by community leadership to effect these advancements.

Relevance:

70.00%

Publisher:

Abstract:

In a world of almost permanent and rapidly increasing electronic data availability, techniques for filtering, compressing, and interpreting this data to transform it into valuable and easily comprehensible information are of utmost importance. One key topic in this area is the capability to deduce future system behavior from a given data input. This book brings together for the first time the complete theory of data-based neurofuzzy modelling and the linguistic attributes of fuzzy logic in a single cohesive mathematical framework. After introducing the basic theory of data-based modelling, new concepts including extended additive and multiplicative submodels are developed, and their extensions to state estimation and data fusion are derived. All these algorithms are illustrated with benchmark and real-life examples to demonstrate their efficiency. Chris Harris and his group have carried out pioneering work which has tied together the fields of neural networks and linguistic rule-based algorithms. This book is aimed at researchers and scientists in time series modeling, empirical data modeling, knowledge discovery, data mining, and data fusion.

Relevance:

70.00%

Publisher:

Abstract:

Navigation of deep space probes most commonly relies on the spacecraft Doppler tracking technique. Orbital parameters are determined from a series of repeated measurements of the frequency shift of a microwave carrier over a given integration time. Currently, both ESA and NASA operate antennas at several sites around the world to ensure the tracking of deep space probes. Only a small number of software packages are currently used to process Doppler observations. The Astronomical Institute of the University of Bern (AIUB) has recently started the development of Doppler data processing capabilities within the Bernese GNSS Software. This software has been extensively used for Precise Orbit Determination of Earth-orbiting satellites using GPS data collected by on-board receivers and for subsequent determination of the Earth gravity field. In this paper, we present the current status of the Doppler data modeling and orbit determination capabilities in the Bernese GNSS Software using GRAIL data. In particular, we focus on the implemented orbit determination procedure used for the combined analysis of Doppler and intersatellite Ka-band data. We show that even at this early stage of development we can achieve an accuracy of a few mHz on two-way S-band Doppler observations and of 2 µm/s on KBRR data from the GRAIL primary mission phase.
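
For a rough sense of scale (not a figure from the paper), the two-way Doppler shift relates the carrier frequency to the spacecraft's line-of-sight range rate, so a residual of a few mHz at S-band corresponds to a velocity-level error on the order of 0.1 mm/s. A minimal sketch, assuming a nominal 2.3 GHz S-band carrier and a 3 mHz residual:

```python
# Sketch: convert a two-way Doppler residual to an equivalent range-rate error.
# Two-way Doppler: delta_f = -2 * f_carrier * range_rate / c (non-relativistic approximation).
# The 2.3 GHz carrier and 3 mHz residual are assumed, illustrative values.

C = 299_792_458.0  # speed of light, m/s

def range_rate_from_doppler(delta_f_hz: float, f_carrier_hz: float) -> float:
    """Line-of-sight range-rate error implied by a two-way Doppler residual."""
    return delta_f_hz * C / (2.0 * f_carrier_hz)

if __name__ == "__main__":
    residual_hz = 3e-3   # "a few mHz"
    s_band_hz = 2.3e9    # nominal S-band carrier
    print(f"{range_rate_from_doppler(residual_hz, s_band_hz) * 1e3:.3f} mm/s")
    # ~0.196 mm/s for these assumed numbers
```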

Relevance:

70.00%

Publisher:

Abstract:

The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU", lays out the theoretical background for the project. Several core concepts are presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrest, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one-value-per-variable paradigm and is widely employed in a host of clinical models and tools; it is often represented by a number in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to time series data elements. The first of these is the raw data elements, represented by multiple values per variable, which constitute the measured observations typically available to end users when they review time series data; these are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood of producing a representation of the time series data elements that is able to distinguish between two or more classes of outcomes.
The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU", provides a detailed description, start to finish, of the methods required to prepare the data, build, and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time-series-based models are infeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model. Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies for each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances.

The final manuscript, entitled "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit", presents the results obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis. In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy compared to the baseline multivariate model, but diminished classification accuracy compared to when just the trend analysis features were added (i.e., without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by exclusion of the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
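
As a concrete (hypothetical) illustration of turning a raw time series into a latent trend feature of the kind described above, one can fit a least-squares slope over a fixed look-back window ending at a reference time. The 30-minute window and variable names in this sketch are assumptions for illustration, not values taken from the dissertation.

```python
# Sketch: derive a latent "trend" feature from a raw vital-sign time series.
# Window length and variable names are illustrative assumptions.
import numpy as np

def trend_feature(times_min: np.ndarray, values: np.ndarray,
                  reference_min: float, window_min: float = 30.0) -> float:
    """Least-squares slope of the signal over the window ending at the reference time.

    Returns NaN if fewer than two samples fall inside the window.
    """
    mask = (times_min >= reference_min - window_min) & (times_min <= reference_min)
    t, v = times_min[mask], values[mask]
    if t.size < 2:
        return float("nan")
    slope, _intercept = np.polyfit(t, v, deg=1)  # units per minute
    return float(slope)

if __name__ == "__main__":
    t = np.array([0, 5, 10, 15, 20, 25, 30], dtype=float)
    spo2 = np.array([98, 97, 97, 96, 95, 93, 91], dtype=float)  # a declining pulse-oximetry trace
    print(trend_feature(t, spo2, reference_min=30.0))  # negative slope flags deterioration
```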

Relevance:

70.00%

Publisher:

Abstract:

This work analyzes Social Communication in the context of the internet and outlines new study methodologies for the field, aimed at filtering meaning, within a scientific framework, from the information flows of social networks, news media, or any other device that allows storage of and access to structured and unstructured information. Reflecting on the paths along which these information flows develop, and especially on the volume produced, the project maps out the fields of meaning that this relationship takes on in research theory and practice. The general objective of this work is to situate the field of Social Communication within the changing and dynamic reality of the internet environment and to draw parallels with applications already carried out in other fields. Using the case study method, three cases were analyzed under two conceptual lenses, Web Sphere Analysis and Web Science, contrasting the information systems in discursive and structural terms. The aim is to observe what Social Communication gains from viewing its objects of study in the internet environment through these perspectives. The results show that seeking new kinds of learning is a challenge for the Social Communication researcher, but the feedback of information in the collaborative environment that the internet provides is fertile ground for research, since data modeling gains an analytical corpus when the set of tools promoted and driven by technology makes it possible to isolate content and to deepen the study of meanings and their relations.

Relevance:

70.00%

Publisher:

Abstract:

With emerging trends for the Internet of Things (IoT) and Smart Cities, complex data transformation, aggregation, and visualization problems are becoming increasingly common. These tasks support improved business intelligence, analytics, and end-user access to data. However, in most cases the developers of these tasks face challenging problems, including noisy data, diverse data formats, data modeling, and increasing demand for sophisticated visualization support. This paper describes our experiences with just such problems in the context of Household Travel Surveys data integration and harmonization. We describe a common approach for addressing these harmonization problems. We then discuss a set of lessons learned from our experience that we hope will be useful for others embarking on similar problems. We also identify several key directions and needs for future research and practical support in this area.
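
To make the idea of harmonization concrete, the sketch below maps two differently formatted (hypothetical) household travel survey records onto one common trip schema; the field names, codes, and unit conversions are illustrative assumptions, not the paper's actual schema.

```python
# Sketch: harmonize two hypothetical survey record formats into a common trip schema.
# Field names, codes, and unit conversions are illustrative assumptions.

def from_survey_a(rec: dict) -> dict:
    """Survey A stores distance in kilometres and mode as a word."""
    return {
        "household_id": str(rec["hh_id"]),
        "trip_distance_km": float(rec["dist_km"]),
        "mode": rec["mode"].strip().lower(),  # e.g. "car", "bus"
    }

def from_survey_b(rec: dict) -> dict:
    """Survey B stores distance in miles and mode as a numeric code."""
    mode_codes = {1: "car", 2: "bus", 3: "walk"}
    return {
        "household_id": str(rec["HOUSEHOLD"]),
        "trip_distance_km": float(rec["DIST_MILES"]) * 1.60934,
        "mode": mode_codes.get(int(rec["MODE_CODE"]), "unknown"),
    }

if __name__ == "__main__":
    harmonized = [
        from_survey_a({"hh_id": 101, "dist_km": 4.2, "mode": "Bus"}),
        from_survey_b({"HOUSEHOLD": 202, "DIST_MILES": 2.0, "MODE_CODE": 1}),
    ]
    print(harmonized)
```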

Relevance:

60.00%

Publisher:

Abstract:

Queensland University of Technology (QUT) is a large multidisciplinary university located in Brisbane, Queensland, Australia. QUT is increasing its research focus and is developing its research support services. It has adopted a model of collaboration between the Library, High Performance Computing and Research Support (HPC), and more broadly with Information Technology Services (ITS). Research support services provided by the Library include the provision of information resources and discovery services, bibliographic management software, assistance with publishing (publishing strategies, identifying high-impact journals, dealing with publishers and the peer review process), citation analysis, and calculating authors' H-index. Research data management services are being developed by the Library and HPC working in collaboration. The HPC group within ITS supports research computing infrastructure, research development and engagement activities, researcher consultation, high-speed computation and data storage systems, 2D/3D (immersive) visualisation tools, parallelisation and optimisation of research codes, statistics/data modeling training and support (both qualitative and quantitative), and support for the university's central Access Grid collaboration facility. Development and engagement activities include participation in research grants and papers, student supervision and internships, and the sponsorship, incubation and adoption of new computing technologies for research. ITS also provides other services that support research, including ICT training, research infrastructure (networking, data storage, federated access and authorization, virtualization), and corporate systems for research administration. Seminars and workshops are offered to increase awareness and uptake of new and existing services. A series of online surveys on eResearch practices and skills and a number of focus groups were conducted to better inform the development of research support services. Progress towards the provision of research support is described within the context of organizational frameworks; resourcing; infrastructure; integration; collaboration; change management; engagement; awareness and skills; new services; and leadership. Challenges to be addressed include the need to redeploy existing operational resources toward new research support services, supporting a rapidly growing research profile across the university, the growing need for the use and support of IT in research programs, finding capacity to address the diverse research support needs across the disciplines, operationalising new research support services following their implementation in project mode, embedding new specialist staff roles, cross-skilling Liaison Librarians, and ensuring continued collaboration between stakeholders.

Relevance:

60.00%

Publisher:

Abstract:

This thesis explored the knowledge and reasoning of young children in solving novel statistical problems, and the influence of problem context and design on their solutions. It found that young children's statistical competencies are underestimated, and that problem design and context facilitated children's application of a wide range of knowledge and reasoning skills, none of which had been taught. A qualitative design-based research method, informed by the Models and Modeling perspective (Lesh & Doerr, 2003), underpinned the study. Data modelling activities incorporating picture story books were used to contextualise the problems. Children applied real-world understanding to problem solving, including attribute identification, categorisation and classification skills. Intuitive and metarepresentational knowledge, together with inductive and probabilistic reasoning, was used to make sense of data, and a beginning awareness of statistical variation and informal inference was visible.

Relevance:

60.00%

Publisher:

Abstract:

Simultaneous recordings of spike trains from multiple single neurons are becoming commonplace. Understanding the interaction patterns among these spike trains remains a key research area. A question of interest is the evaluation of information flow between neurons through the analysis of whether one spike train exerts causal influence on another. For continuous-valued time series data, Granger causality has proven an effective method for this purpose. However, the basis for Granger causality estimation is autoregressive data modeling, which is not directly applicable to spike trains, and the various filtering options distort the properties of spike trains as point processes. Here we propose a new nonparametric approach to estimate Granger causality directly from the Fourier transforms of spike train data. We validate the method on synthetic spike trains generated by model networks of neurons with known connectivity patterns and then apply it to neurons simultaneously recorded from the thalamus and the primary somatosensory cortex of a squirrel monkey undergoing tactile stimulation.
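
For context (standard background, not a derivation from this abstract), spectral Granger causality in the Geweke sense can be evaluated once the bivariate spectral matrix S(ω) has been factorized into a transfer function H(ω) and a noise covariance Σ; in a nonparametric approach that factorization is obtained from Fourier-based spectral estimates rather than from a fitted autoregressive model. The frequency-domain causality from process Y to process X is then:

```latex
% Geweke's frequency-domain Granger causality from Y to X, given the spectral
% factorization S(\omega) = H(\omega)\,\Sigma\,H^{*}(\omega).
I_{Y \to X}(\omega) =
\ln \frac{S_{xx}(\omega)}
         {S_{xx}(\omega) - \left(\Sigma_{yy} - \frac{\Sigma_{xy}^{2}}{\Sigma_{xx}}\right)
          \left|H_{xy}(\omega)\right|^{2}}
```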

Relevance:

60.00%

Publisher:

Abstract:

Multielectrode neurophysiological recording and high-resolution neuroimaging generate multivariate data that are the basis for understanding the patterns of neural interactions. How to extract directions of information flow in brain networks from these data remains a key challenge. Research over the last few years has identified Granger causality as a statistically principled technique to furnish this capability. The estimation of Granger causality currently requires autoregressive modeling of neural data. Here, we propose a nonparametric approach based on widely used Fourier and wavelet transforms to estimate both pairwise and conditional measures of Granger causality, eliminating the need for explicit autoregressive data modeling. We demonstrate the effectiveness of this approach by applying it to synthetic data generated by network models with known connectivity and to local field potentials recorded from monkeys performing a sensorimotor task.

Relevance:

60.00%

Publisher:

Abstract:

The phase behavior of binary systems containing a light and a heavy hydrocarbon is very important both for actual process design and for the development of theoretical models. To meet the growing demand for experimental high-pressure phase equilibrium information, the objective of this study is to obtain a methodology that partially replaces, or makes the most of, the limited experimental information available. To this end, we propose modeling the phase equilibrium of mixtures of a light hydrocarbon with a heavy one, without knowledge of the molecular structure of the heavy compound, inferring the model parameters from the modeling of bubble-point data taken from the literature. This methodology involves not only describing the phase equilibrium of a system but also estimating the critical properties of the heavy compound, which are difficult to obtain experimentally because such compounds crack at high temperatures. In this context, this study presents a strategy that indirectly estimates the critical properties of heavy compounds. To do so, experimental bubble-point data for binary mixtures containing a light and a heavy hydrocarbon were correlated using two models: Peng-Robinson and TPT1M (modified Wertheim first-order thermodynamic perturbation theory). The parameters fitted with the Peng-Robinson model correspond directly to the critical properties of the heavy compound, while those fitted with the TPT1M model were used to obtain them. This strategy yields model-dependent parameters, but it allows other thermodynamic properties to be calculated, such as extrapolating the studied data in temperature. Furthermore, correlating the fitted parameters with the available critical properties is expected to help in the characterization of heavy fractions of unknown composition.
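
For reference (standard textbook relations, not the fitting procedure used in the study), the Peng-Robinson parameters are computed directly from a compound's critical temperature, critical pressure, and acentric factor, which is why fitting them against bubble-point data amounts to estimating the heavy compound's critical properties. A minimal sketch, with approximate n-decane constants used only as example inputs:

```python
# Sketch: standard Peng-Robinson EOS parameters from critical properties.
# (Textbook relations; the example values for n-decane are approximate.)
import math

R = 8.314462618  # J/(mol*K)

def peng_robinson_ab(T: float, Tc: float, Pc: float, omega: float):
    """Return the PR attraction parameter a(T) [Pa*m^6/mol^2] and covolume b [m^3/mol]."""
    kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega**2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(T / Tc)))**2
    a = 0.45724 * (R * Tc)**2 / Pc * alpha
    b = 0.07780 * R * Tc / Pc
    return a, b

if __name__ == "__main__":
    # Approximate critical constants for n-decane (illustrative only).
    a, b = peng_robinson_ab(T=400.0, Tc=617.7, Pc=2.11e6, omega=0.492)
    print(f"a = {a:.3e} Pa*m^6/mol^2, b = {b:.3e} m^3/mol")
```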

Relevance:

60.00%

Publisher:

Abstract:

Despite the environmental impacts caused by pollution and chemical accidents, some organizations still invest little in preventing, reducing, or eliminating their waste. In some Brazilian teaching and research institutions (IES), inadequate handling of the hazardous waste generated in teaching and research laboratories is not uncommon, increasing these risks. Minimizing or eliminating such risks requires investment in treatment technologies and in the selection of suitable management methods. The objective of this research was to model an Integrated Hazardous Waste Management System and to validate it through a pilot study in the laboratories of the Chemistry and Biology Institutes of the Universidade do Estado do Rio de Janeiro. This empirical, exploratory research was carried out through a literature review and the collection of data on the state of the art of waste management at selected national and international institutions, followed by the selection of a suitable system to be modeled and applied in these contexts. The fieldwork consisted of collecting data through direct observation and a questionnaire administered to those responsible for the laboratories. The stages of the study were: surveying the laboratory facilities; observing waste handling and generation; building the database; qualitative and quantitative data analysis; modeling the Integrated Hazardous Waste Management System (SIGIRPE); deploying the model; presenting and evaluating the results; and writing the user manual for the system. Quantitative waste monitoring was performed with the system's tools for temporal analysis. The results made it possible to understand the dynamics and problems in the laboratories and to verify the model's potential. We conclude that SIGIRPE can be applied in other contexts, provided it is adapted for that purpose. An institutional structure that prepares the Integrated Waste Management Plan and makes its implementation feasible is essential. The university, as the place where future professionals are trained, is a privileged locus for building and disseminating knowledge, and it has a duty to adopt good practices in dealing with environmental issues, particularly waste. Universities should therefore include environmental policies for their campuses among their action strategies, with ongoing environmental education. We hope this work contributes to the planning and management of the hazardous waste generated in laboratories and to the changes needed to move toward environmental sustainability. SIGIRPE was developed and tested, but it was not possible to verify its use by other users. That is what we expect from the continuation of this research and from future work, such as: testing the system in hospitals, laboratories, and clinics; studying other applications in laboratory chemical safety by adding internal waste transport routes, escape routes, risk maps, and the location of personal and collective protective equipment; and demonstrating the system's potential and raising awareness among the groups involved through talks, short courses, and other information strategies, including specialized scientific journals.

Relevance:

60.00%

Publisher:

Abstract:

In 1828, a phenomenon was observed under the microscope in which tiny pollen grains immersed in a liquid at rest moved randomly, tracing out a disordered motion. The question was how to understand this motion. About 80 years later, Einstein (1905) developed a mathematical formulation to explain this phenomenon, known as Brownian motion, a theory that has since been developed further in many fields of knowledge, including, more recently, computational modeling. The aim of this work is to set out the basic assumptions of the simple random walk, considering experiments with and without boundary-value problems, for a better understanding of the use of algorithms applied to computational problems. The tools needed to apply simulation models of the simple random walk in the first three spatial dimensions are presented. Attention is directed both to the simple random walk itself and to possible applications to the gambler's ruin problem and to virus spread in computer networks. Algorithms for the one-dimensional simple random walk, with and without the boundary-value problem, were developed on the R platform. They were likewise implemented for two- and three-dimensional spaces, enabling future applications to the problem of virus spread in computer networks and serving as motivation for studying the heat equation, although a stronger grounding in physics and probability concepts is needed to pursue that application.
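
Below is a minimal sketch (in Python rather than the R implementation described in the abstract) of the one-dimensional simple random walk with absorbing boundaries, i.e. the gambler's ruin setup; the starting position and boundary values are illustrative assumptions.

```python
# Sketch: 1-D simple random walk with absorbing boundaries (the gambler's ruin setup).
# Starting position and boundary values are illustrative assumptions.
import random

def gamblers_ruin(start: int, lower: int, upper: int, p: float = 0.5):
    """Step +1 with probability p, otherwise -1, until an absorbing boundary is hit.

    Returns (boundary reached, number of steps taken).
    """
    position, steps = start, 0
    while lower < position < upper:
        position += 1 if random.random() < p else -1
        steps += 1
    return position, steps

if __name__ == "__main__":
    random.seed(42)
    trials = 10_000
    ruined = sum(gamblers_ruin(5, 0, 10)[0] == 0 for _ in range(trials))
    # For a fair walk started midway between the boundaries, the ruin probability is ~0.5.
    print(f"estimated ruin probability: {ruined / trials:.3f}")
```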

Relevance:

60.00%

Publisher:

Abstract:

In this paper, we introduce a novel approach to face recognition which simultaneously tackles three combined challenges: 1) uneven illumination; 2) partial occlusion; and 3) limited training data. The new approach performs lighting normalization, occlusion de-emphasis, and finally face recognition, based on finding the largest matching area (LMA) at each point on the face, as opposed to traditional fixed-size local-area-based approaches. Robustness is achieved with novel approaches for feature extraction, LMA-based face image comparison, and unseen data modeling. On the Extended Yale B and AR face databases for face identification, our method, using only a single training image per person, outperforms other methods using a single training image, and matches or exceeds methods which require multiple training images. On the Labeled Faces in the Wild (LFW) face verification database, our method outperforms comparable unsupervised methods. We also show that the new method performs competitively even when the training images are corrupted.