906 resultados para Exploratory statistical data analysis


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dissertação de Mestrado, Gestão de Unidades de Saúde, Faculdade de Economia, Universidade do Algarve, 2016

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The study of complex systems has become a prestigious area of science, although relatively young . Its importance was demonstrated by the diversity of applications that several studies have already provided to various fields such as biology , economics and Climatology . In physics , the approach of complex systems is creating paradigms that influence markedly the new methods , bringing to Statistical Physics problems macroscopic level no longer restricted to classical studies such as those of thermodynamics . The present work aims to make a comparison and verification of statistical data on clusters of profiles Sonic ( DT ) , Gamma Ray ( GR ) , induction ( ILD ) , neutron ( NPHI ) and density ( RHOB ) to be physical measured quantities during exploratory drilling of fundamental importance to locate , identify and characterize oil reservoirs . Software were used : Statistica , Matlab R2006a , Origin 6.1 and Fortran for comparison and verification of the data profiles of oil wells ceded the field Namorado School by ANP ( National Petroleum Agency ) . It was possible to demonstrate the importance of the DFA method and that it proved quite satisfactory in that work, coming to the conclusion that the data H ( Hurst exponent ) produce spatial data with greater congestion . Therefore , we find that it is possible to find spatial pattern using the Hurst coefficient . The profiles of 56 wells have confirmed the existence of spatial patterns of Hurst exponents , ie parameter B. The profile does not directly assessed catalogs verification of geological lithology , but reveals a non-random spatial distribution

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the exponential growth of the usage of web-based map services, the web GIS application has become more and more popular. Spatial data index, search, analysis, visualization and the resource management of such services are becoming increasingly important to deliver user-desired Quality of Service. First, spatial indexing is typically time-consuming and is not available to end-users. To address this, we introduce TerraFly sksOpen, an open-sourced an Online Indexing and Querying System for Big Geospatial Data. Integrated with the TerraFly Geospatial database [1-9], sksOpen is an efficient indexing and query engine for processing Top-k Spatial Boolean Queries. Further, we provide ergonomic visualization of query results on interactive maps to facilitate the user’s data analysis. Second, due to the highly complex and dynamic nature of GIS systems, it is quite challenging for the end users to quickly understand and analyze the spatial data, and to efficiently share their own data and analysis results with others. Built on the TerraFly Geo spatial database, TerraFly GeoCloud is an extra layer running upon the TerraFly map and can efficiently support many different visualization functions and spatial data analysis models. Furthermore, users can create unique URLs to visualize and share the analysis results. TerraFly GeoCloud also enables the MapQL technology to customize map visualization using SQL-like statements [10]. Third, map systems often serve dynamic web workloads and involve multiple CPU and I/O intensive tiers, which make it challenging to meet the response time targets of map requests while using the resources efficiently. Virtualization facilitates the deployment of web map services and improves their resource utilization through encapsulation and consolidation. Autonomic resource management allows resources to be automatically provisioned to a map service and its internal tiers on demand. v-TerraFly are techniques to predict the demand of map workloads online and optimize resource allocations, considering both response time and data freshness as the QoS target. The proposed v-TerraFly system is prototyped on TerraFly, a production web map service, and evaluated using real TerraFly workloads. The results show that v-TerraFly can accurately predict the workload demands: 18.91% more accurate; and efficiently allocate resources to meet the QoS target: improves the QoS by 26.19% and saves resource usages by 20.83% compared to traditional peak load-based resource allocation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

MEGAGEO - Moving megaliths in the Neolithic is a project that intends to find the provenience of lithic materials in the construction of tombs. A multidisciplinary approach is carried out, with researchers from several of the knowledge fields involved. This work presents a spatial data warehouse specially developed for this project that comprises information from national archaeological databases, geographic and geological information and new geochemical and petrographic data obtained during the project. The use of the spatial data warehouse proved to be essential in the data analysis phase of the project. The Redondo Area is presented as a case study for the application of the spatial data warehouse to analyze the relations between geochemistry, geology and the tombs in this area.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Collecting and analysing data is an important element in any field of human activity and research. Even in sports, collecting and analyzing statistical data is attracting a growing interest. Some exemplar use cases are: improvement of technical/tactical aspects for team coaches, definition of game strategies based on the opposite team play or evaluation of the performance of players. Other advantages are related to taking more precise and impartial judgment in referee decisions: a wrong decision can change the outcomes of important matches. Finally, it can be useful to provide better representations and graphic effects that make the game more engaging for the audience during the match. Nowadays it is possible to delegate this type of task to automatic software systems that can use cameras or even hardware sensors to collect images or data and process them. One of the most efficient methods to collect data is to process the video images of the sporting event through mixed techniques concerning machine learning applied to computer vision. As in other domains in which computer vision can be applied, the main tasks in sports are related to object detection, player tracking, and to the pose estimation of athletes. The goal of the present thesis is to apply different models of CNNs to analyze volleyball matches. Starting from video frames of a volleyball match, we reproduce a bird's eye view of the playing court where all the players are projected, reporting also for each player the type of action she/he is performing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The study of random probability measures is a lively research topic that has attracted interest from different fields in recent years. In this thesis, we consider random probability measures in the context of Bayesian nonparametrics, where the law of a random probability measure is used as prior distribution, and in the context of distributional data analysis, where the goal is to perform inference given avsample from the law of a random probability measure. The contributions contained in this thesis can be subdivided according to three different topics: (i) the use of almost surely discrete repulsive random measures (i.e., whose support points are well separated) for Bayesian model-based clustering, (ii) the proposal of new laws for collections of random probability measures for Bayesian density estimation of partially exchangeable data subdivided into different groups, and (iii) the study of principal component analysis and regression models for probability distributions seen as elements of the 2-Wasserstein space. Specifically, for point (i) above we propose an efficient Markov chain Monte Carlo algorithm for posterior inference, which sidesteps the need of split-merge reversible jump moves typically associated with poor performance, we propose a model for clustering high-dimensional data by introducing a novel class of anisotropic determinantal point processes, and study the distributional properties of the repulsive measures, shedding light on important theoretical results which enable more principled prior elicitation and more efficient posterior simulation algorithms. For point (ii) above, we consider several models suitable for clustering homogeneous populations, inducing spatial dependence across groups of data, extracting the characteristic traits common to all the data-groups, and propose a novel vector autoregressive model to study of growth curves of Singaporean kids. Finally, for point (iii), we propose a novel class of projected statistical methods for distributional data analysis for measures on the real line and on the unit-circle.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this thesis, we investigate the role of applied physics in epidemiological surveillance through the application of mathematical models, network science and machine learning. The spread of a communicable disease depends on many biological, social, and health factors. The large masses of data available make it possible, on the one hand, to monitor the evolution and spread of pathogenic organisms; on the other hand, to study the behavior of people, their opinions and habits. Presented here are three lines of research in which an attempt was made to solve real epidemiological problems through data analysis and the use of statistical and mathematical models. In Chapter 1, we applied language-inspired Deep Learning models to transform influenza protein sequences into vectors encoding their information content. We then attempted to reconstruct the antigenic properties of different viral strains using regression models and to identify the mutations responsible for vaccine escape. In Chapter 2, we constructed a compartmental model to describe the spread of a bacterium within a hospital ward. The model was informed and validated on time series of clinical measurements, and a sensitivity analysis was used to assess the impact of different control measures. Finally (Chapter 3) we reconstructed the network of retweets among COVID-19 themed Twitter users in the early months of the SARS-CoV-2 pandemic. By means of community detection algorithms and centrality measures, we characterized users’ attention shifts in the network, showing that scientific communities, initially the most retweeted, lost influence over time to national political communities. In the Conclusion, we highlighted the importance of the work done in light of the main contemporary challenges for epidemiological surveillance. In particular, we present reflections on the importance of nowcasting and forecasting, the relationship between data and scientific research, and the need to unite the different scales of epidemiological surveillance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nowadays, product development in all its phases plays a fundamental role in the industrial chain. The need for a company to compete at high levels, the need to be quick in responding to market demands and therefore to be able to engineer the product quickly and with a high level of quality, has led to the need to get involved in new more advanced methods/ processes. In recent years, we are moving away from the concept of 2D-based design and production and approaching the concept of Model Based Definition. By using this approach, increasingly complex systems turn out to be easier to deal with but above all cheaper in obtaining them. Thanks to the Model Based Definition it is possible to share data in a lean and simple way to the entire engineering and production chain of the product. The great advantage of this approach is precisely the uniqueness of the information. In this specific thesis work, this approach has been exploited in the context of tolerances with the aid of CAD / CAT software. Tolerance analysis or dimensional variation analysis is a way to understand how sources of variation in part size and assembly constraints propagate between parts and assemblies and how that range affects the ability of a project to meet its requirements. It is critically important to note how tolerance directly affects the cost and performance of products. Worst Case Analysis (WCA) and Statistical analysis (RSS) are the two principal methods in DVA. The thesis aims to show the advantages of using statistical dimensional analysis by creating and examining various case studies, using PTC CREO software for CAD modeling and CETOL 6σ for tolerance analysis. Moreover, it will be provided a comparison between manual and 3D analysis, focusing the attention to the information lost in the 1D case. The results obtained allow us to highlight the need to use this approach from the early stages of the product design cycle.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

To assess the completeness and reliability of the Information System on Live Births (Sinasc) data. A cross-sectional analysis of the reliability and completeness of Sinasc's data was performed using a sample of Live Birth Certificate (LBC) from 2009, related to births from Campinas, Southeast Brazil. For data analysis, hospitals were grouped according to category of service (Unified National Health System, private or both), 600 LBCs were randomly selected and the data were collected in LBC-copies through mothers and newborns' hospital records and by telephone interviews. The completeness of LBCs was evaluated, calculating the percentage of blank fields, and the LBCs agreement comparing the originals with the copies was evaluated by Kappa and intraclass correlation coefficients. The percentage of completeness of LBCs ranged from 99.8%-100%. For the most items, the agreement was excellent. However, the agreement was acceptable for marital status, maternal education and newborn infants' race/color, low for prenatal visits and presence of birth defects, and very low for the number of deceased children. The results showed that the municipality Sinasc is reliable for most of the studied variables. Investments in training of the professionals are suggested in an attempt to improve system capacity to support planning and implementation of health activities for the benefit of maternal and child population.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Universidade Estadual de Campinas . Faculdade de Educação Física

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Universidade Estadual de Campinas . Faculdade de Educação Física

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Universidade Estadual de Campinas. Faculdade de Educação Física

Relevância:

100.00% 100.00%

Publicador:

Resumo:

OBJETIVO: Este artigo analisa e compara os dados de consumo alimentar de duas populações ribeirinhas da Amazônia vivendo em ecossistemas contrastantes de floresta tropical: a várzea estacional e a floresta de terra firme. MÉTODOS: Foi estudado o consumo alimentar de 11 unidades domésticas na várzea (Ilha de Ituqui, Município de Santarém) e 17 na terra firme (Floresta Nacional de Caxiuanã, Municípios de Melgaço e Portel). O método utilizado foi o recordatório de 24 horas. As análises estatísticas foram executadas com o auxílio do programa Statistical Package for Social Sciences 12.0. RESULTADOS: Em ambos os ecossistemas, os resultados confirmam a centralidade do pescado e da mandioca na dieta local. Porém, a contribuição de outros itens alimentares secundários, tais como o açaí (em Caxiuanã) e o leite in natura (em Ituqui), também foi significante. Além disso, o açúcar revelou ser uma fonte de energia confiável para enfrentar as flutuações sazonais dos recursos naturais. Parece haver ainda uma maior contribuição energética dos peixes para a dieta de Ituqui, provavelmente em função da maior produtividade dos rios e lagos da várzea em relação à terra firme. Por fim, Ituqui revelou uma maior dependência de itens alimentares comprados, enquanto Caxiuanã mostrou estar ainda bastante vinculada à agricultura e às redes locais de troca. CONCLUSÃO: Além dos resultados confirmarem a importância do pescado e da mandioca, também mostraram que produtos industrializados, como o açúcar, têm um papel importante nas dietas, podendo apontar para tendências no consumo alimentar relacionadas com a atual transição nutricional e com a erosão, em diferentes níveis, dos sistemas de subsistência locais.