918 results for Spatial analysis statistics -- Data processing
Abstract:
OBJECTIVE To analyze the spatial distribution of homicide mortality in the state of Bahia, Northeastern Brazil. METHODS Ecological study of the male population aged 15 to 39 years in the state of Bahia in the period 1996-2010. Data from the Mortality Information System on homicides (X85-Y09) and population estimates from the Brazilian Institute of Geography and Statistics were used. Spatial correlation, the presence of clusters, and critical areas for the event studied were analyzed using the global and local Moran’s I indices. RESULTS A non-random spatial pattern was observed in the distribution of rates, together with three clusters: the first in the north health district, the second in the eastern region, and the third comprising townships in the south and the far south of Bahia. CONCLUSIONS Homicide mortality in the three critical areas requires further studies that consider socioeconomic, cultural and environmental characteristics in order to guide specific preventive and interventionist practices.
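The global and local Moran's I indices named in the abstract can be sketched as follows. This is a minimal illustration, not the study's computation: the four rates and the chain-shaped adjacency matrix are hypothetical, and the function names are chosen here for illustration.

```python
import numpy as np

def global_morans_i(values, weights):
    """Global Moran's I: one number summarising spatial autocorrelation."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                      # deviations from the mean rate
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

def local_morans_i(values, weights):
    """Local Moran's I: one value per area, used to flag clusters."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    m2 = (z @ z) / x.size                 # second moment of the deviations
    return z * (w @ z) / m2

# Hypothetical rates for four areas arranged in a row (assumption).
rates = np.array([10.0, 12.0, 30.0, 33.0])
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)

i_global = global_morans_i(rates, adjacency)
i_local = local_morans_i(rates, adjacency)
```

A positive global value indicates that neighbouring areas tend to have similar rates; the local values sum to the global index multiplied by the total weight, which is a convenient consistency check.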
Abstract:
OBJECTIVE To analyze temporal trends and distribution patterns of unsafe abortion in Brazil. METHODS Ecological study based on records of hospital admissions of women due to abortion in Brazil between 1996 and 2012, obtained from the Hospital Information System of the Ministry of Health. We estimated the number of unsafe abortions stratified by place of residence, using indirect estimate techniques. The following indicators were calculated: ratio of unsafe abortions/100 live births and rate of unsafe abortion/1,000 women of childbearing age. We analyzed temporal trends through polynomial regression and spatial distribution using municipalities as the unit of analysis. RESULTS In the study period, a total of 4,007,327 hospital admissions due to abortions were recorded in Brazil. We estimated a total of 16,905,911 unsafe abortions in the country, with an annual mean of 994,465 abortions (mean unsafe abortion rate: 17.0 abortions/1,000 women of childbearing age; ratio of unsafe abortions: 33.2/100 live births). Unsafe abortion presented a declining trend at national level (R2: 94.0%, p < 0.001), with unequal patterns between regions. There was a significant reduction of unsafe abortion in the Northeast (R2: 93.0%, p < 0.001), Southeast (R2: 92.0%, p < 0.001) and Central-West regions (R2: 64.0%, p < 0.001), whereas the North (R2: 39.0%, p = 0.030) presented an increase, and the South (R2: 22.0%, p = 0.340) remained stable. Spatial analysis identified the presence of clusters of municipalities with high values for unsafe abortion, located mainly in states of the North, Northeast and Southeast Regions. CONCLUSIONS Unsafe abortion remains a public health problem in Brazil, with marked regional differences, mainly concentrated in the socioeconomically disadvantaged regions of the country. 
Improving the quality of women’s health care, particularly reproductive health care and care before and after abortion, is a necessary and urgent strategy for the country.
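The two indicators defined in the abstract, and the reported annual mean, reduce to simple arithmetic. The sketch below checks the annual mean against the reported total over the 17 years from 1996 to 2012; the counts passed to the indicator functions are hypothetical, chosen only to reproduce the reported national values, and the function names are assumptions.

```python
def per_100_live_births(unsafe_abortions, live_births):
    """Ratio of unsafe abortions per 100 live births."""
    return 100.0 * unsafe_abortions / live_births

def per_1000_women(unsafe_abortions, women_of_childbearing_age):
    """Rate of unsafe abortions per 1,000 women of childbearing age."""
    return 1000.0 * unsafe_abortions / women_of_childbearing_age

# Figures taken from the abstract.
years = 2012 - 1996 + 1            # study period, both end years included
total_estimated = 16_905_911       # estimated unsafe abortions, 1996-2012
annual_mean = total_estimated / years

# Hypothetical counts (assumption) that reproduce the reported indicators.
example_ratio = per_100_live_births(332, 1_000)
example_rate = per_1000_women(17, 1_000)
```

Rounded to the nearest integer, `annual_mean` matches the annual mean of 994,465 reported in the abstract.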
Abstract:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
Introduction In 1999, Birigui and Araçatuba were the first municipalities in the State of São Paulo to present autochthonous cases of visceral leishmaniasis in humans (VLH). The aim of this study was to describe the temporal, spatial and spatiotemporal behaviors of VLH in Birigui. Methods Secondary data were obtained from the Notifiable Diseases Information System from 1999 to 2012. The incidence, mortality and case fatality rates by sex and age were calculated. The cases of VLH were geocoded and grouped according to census tracts. Local empirical Bayesian incidence rates were calculated. The existence of spatial and spatiotemporal clusters was investigated using SaTScan software. Results There were 156 confirmed cases of autochthonous VLH. The incidence rate was higher in the 0-4-year-old children, and the mortality and case fatality rates were higher in people aged 60 years and older. The peaks of incidence occurred in 2006 and 2011. The Bayesian rates identified the presence of VLH in all of the census tracts in the municipality; however, spatial and spatiotemporal clusters were found in the central area of the municipality. Conclusions Birigui, located in the Araçatuba region, has recently experienced increasing numbers of VLH cases; this increase is contrary to the behavior observed over the entire region, which has shown a decreasing trend in the number of VLH cases. The observations that the highest incidence is in children 0-4 years old and the highest mortality is in people 60 years and older are in agreement with the expected patterns of VLH.
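Empirical Bayesian smoothing of incidence rates can be sketched as below. This is a simplified illustration under stated assumptions: it uses the global moment-based (Marshall-type) estimator, which shrinks every tract towards the overall rate, whereas the study used local rates that shrink towards a neighbourhood mean. The case counts and populations are hypothetical.

```python
import numpy as np

def global_empirical_bayes(cases, population):
    """Shrink raw rates towards the overall rate; small populations shrink most.

    Assumes at least one case overall, so the overall rate is positive.
    """
    y = np.asarray(cases, dtype=float)
    n = np.asarray(population, dtype=float)
    raw = y / n
    m = y.sum() / n.sum()                           # overall rate
    s2 = (n * (raw - m) ** 2).sum() / n.sum()       # weighted variance of raw rates
    a = max(s2 - m / n.mean(), 0.0)                 # estimated variance of true rates
    shrink = a / (a + m / n)                        # weight given to the raw rate
    return m + shrink * (raw - m)

# Hypothetical census tracts (assumption): cases and resident population.
cases = np.array([0, 30, 40, 5])
population = np.array([200, 1000, 5000, 100])

raw_rates = cases / population
overall_rate = cases.sum() / population.sum()
smoothed = global_empirical_bayes(cases, population)
```

The tract with 100 residents is pulled much further towards the overall rate than the tract with 5,000 residents, which is the intended stabilising effect for small-area rates.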
Abstract:
INTRODUCTION: Several municipalities in the western region of the State of São Paulo have been affected by human visceral leishmaniasis (HVL), including the city of Adamantina, where the first autochthonous cases occurred in 2004. This study aimed to describe the spatial and spatiotemporal occurrence of HVL in Adamantina. METHODS: Secondary data on the occurrence of HVL in Adamantina between 2004 and 2011 were used. Incidence, mortality, and case fatality rates were calculated. We used local empirical Bayesian incidence rates to represent the occurrence of the disease in the census sectors of the city. The existence of spatial and spatiotemporal clusters of cases was evaluated using scan statistics. In situ observation was performed to assess the socioeconomic and environmental characteristics of the areas with medium and high incidences. RESULTS: Adamantina reported cases in 70% of its census sectors. No differences were observed between sexes. The group aged 0-4 years presented the highest incidence and mortality rates, and the group aged 40-59 years presented the highest case fatality rate. We detected a spatiotemporal cluster, which coincided with the onset of endemic transmission in the city. CONCLUSIONS: The individuals most affected by the disease were children. The disease was present in areas with both better and worse socioeconomic conditions. The use of spatial analysis techniques was important to achieve the study objectives.
Abstract:
Recently, there has been growing interest in the field of metabolomics, reflected in a remarkable growth in experimental techniques, available data and related biological applications. Techniques such as Nuclear Magnetic Resonance, Gas or Liquid Chromatography, Mass Spectrometry, and Infrared and UV-visible spectroscopies have provided extensive datasets that can help in tasks such as biological and biomedical discovery, biotechnology and drug development. However, as with other omics data, the analysis of metabolomics datasets poses multiple challenges, both in terms of methodologies and in the development of appropriate computational tools. Indeed, none of the available software tools addresses the multiplicity of existing techniques and data analysis tasks. In this work, we make available a novel R package, named specmine, which provides a set of methods for metabolomics data analysis, including data loading in different formats, pre-processing, metabolite identification, univariate and multivariate data analysis, machine learning, and feature selection. Importantly, the implemented methods provide adequate support for the analysis of data from diverse experimental techniques, integrating a large set of functions from several R packages in a powerful yet simple-to-use environment. The package, already available on CRAN, is accompanied by a web site where users can deposit datasets, scripts and analysis reports to be shared with the community, promoting the efficient sharing of metabolomics data analysis pipelines.
Abstract:
Acute cases of schistosomiasis have been found in the coastal area of Pernambuco, Brazil, due to environmental disturbances and disorderly occupation of urban areas. This study identifies and spatially marks the main foci of the snail host species, Biomphalaria glabrata, on Itamaracá Island. The chaotic occupation of the beach resorts has favoured the emergence of transmission foci, thus exposing residents and tourists to the risk of infection. A database covering five years of epidemiological investigation of snails infected by Schistosoma mansoni on the island was produced, with information on the geographic position of the foci, the number of snails collected, the number of snails that tested positive, and their infection rate. The spatial positions of the foci were recorded with the Global Positioning System (GPS), and the pairs of geographical coordinates were imported into and plotted with AutoCad 2000. The software packages ArcView and Spring were used for data processing and spatial analysis. Between 1998 and 2002, 5,009 snails were collected in Forte Beach, of which 12.2% were positive for S. mansoni. A total of 27 foci and areas of environmental risk were identified and spatially analyzed, allowing the identification of areas exposed to varying degrees of risk.
Abstract:
Dual scaling of a subjects-by-objects table of dominance data (preferences, paired comparisons and successive categories data) has been contrasted with correspondence analysis, as if the two techniques were somehow different. In this note we show that dual scaling of dominance data is equivalent to the correspondence analysis of a table which is doubled with respect to subjects. We also show that the results of both methods can be recovered from a principal components analysis of the undoubled dominance table which is centred with respect to subject means.
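For readers unfamiliar with correspondence analysis, the technique the note compares with dual scaling, a minimal sketch via the singular value decomposition is given below. It shows plain correspondence analysis of a small hypothetical table only; it does not construct the doubled dominance table or reproduce the equivalence proved in the note.

```python
import numpy as np

def correspondence_analysis(table):
    """Return principal inertias and principal coordinates of rows and columns."""
    n = np.asarray(table, dtype=float)
    p = n / n.sum()                                  # correspondence matrix
    r = p.sum(axis=1)                                # row masses
    c = p.sum(axis=0)                                # column masses
    expected = np.outer(r, c)
    s = (p - expected) / np.sqrt(expected)           # standardised residuals
    u, sv, vt = np.linalg.svd(s, full_matrices=False)
    row_coords = (u * sv) / np.sqrt(r)[:, None]
    col_coords = (vt.T * sv) / np.sqrt(c)[:, None]
    return sv ** 2, row_coords, col_coords

# Hypothetical 3 x 3 table of counts (assumption).
counts = np.array([[20, 5, 5],
                   [5, 20, 5],
                   [5, 5, 20]])

inertias, row_coords, col_coords = correspondence_analysis(counts)
row_masses = counts.sum(axis=1) / counts.sum()
```

The principal inertias sum to the Pearson chi-square statistic of the table divided by its grand total, and the mass-weighted mean of the row coordinates is zero on every axis.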
Abstract:
The current research project is both a process and impact evaluation of community policing in Switzerland's five major urban areas - Basel, Bern, Geneva, Lausanne, and Zurich. Community policing is both a philosophy and an organizational strategy that promotes a renewed partnership between the police and the community to solve problems of crime and disorder. The process evaluation data on police internal reforms were obtained through semi-structured interviews with key administrators from the five police departments as well as from police internal documents and additional public sources. 
The impact evaluation uses official crime records and census statistics as contextual variables as well as Swiss Crime Survey (SCS) data on fear of crime, perceptions of disorder, and public attitudes towards the police as outcome measures. The SCS is a standing survey instrument that has polled residents of the five urban areas repeatedly since the mid-1980s. The process evaluation produced a "Calendar of Action" to create panel data to measure community policing implementation progress over six evaluative dimensions in intervals of five years between 1990 and 2010. The impact evaluation, carried out ex post facto, uses an observational design that analyzes the impact of the different community policing models between matched comparison areas across the five cities. Using ZIP code districts as proxies for urban neighborhoods, geospatial data mining algorithms serve to develop a neighborhood typology in order to match the comparison areas. To this end, both unsupervised and supervised algorithms are used to analyze high-dimensional data on crime, the socio-economic and demographic structure, and the built environment in order to classify urban neighborhoods into clusters of similar type. In a first step, self-organizing maps serve as tools to develop a clustering algorithm that reduces the within-cluster variance in the contextual variables and simultaneously maximizes the between-cluster variance in survey responses. The random forests algorithm then serves to assess the appropriateness of the resulting neighborhood typology and to select the key contextual variables in order to build a parsimonious model that makes a minimum of classification errors. 
Finally, for the impact analysis, propensity score matching methods are used to match the survey respondents of the pretest and posttest samples on age, gender, and their level of education for each neighborhood type identified within each city, before conducting a statistical test of the observed difference in the outcome measures. Moreover, all significant results were subjected to a sensitivity analysis to assess the robustness of these findings in the face of potential bias due to some unobserved covariates. The study finds that over the last fifteen years, all five police departments have undertaken major reforms of their internal organization and operating strategies and forged strategic partnerships in order to implement community policing. The resulting neighborhood typology reduced the within-cluster variance of the contextual variables and accounted for a significant share of the between-cluster variance in the outcome measures prior to treatment, suggesting that geocomputational methods help to balance the observed covariates and hence to reduce threats to the internal validity of an observational design. Finally, the impact analysis revealed that fear of crime dropped significantly over the 2000-2005 period in the neighborhoods in and around the urban centers of Bern and Zurich. These improvements are fairly robust in the face of bias due to some unobserved covariate and covary temporally and spatially with the implementation of community policing. The alternative hypothesis that the observed reductions in fear of crime were at least in part a result of community policing interventions thus appears at least as plausible as the null hypothesis of absolutely no effect, even if the observational design cannot completely rule out selection and regression to the mean as alternative explanations.
Abstract:
Numerous sources of evidence point to the fact that heterogeneity within the Earth's deep crystalline crust is complex and hence may be best described through stochastic rather than deterministic approaches. As seismic reflection imaging arguably offers the best means of sampling deep crustal rocks in situ, much interest has been expressed in using such data to characterize the stochastic nature of crustal heterogeneity. Previous work on this problem has shown that the spatial statistics of seismic reflection data are indeed related to those of the underlying heterogeneous seismic velocity distribution. As of yet, however, the nature of this relationship has remained elusive due to the fact that most of the work was either strictly empirical or based on incorrect methodological approaches. Here, we introduce a conceptual model, based on the assumption of weak scattering, that allows us to quantitatively link the second-order statistics of a 2-D seismic velocity distribution with those of the corresponding processed and depth-migrated seismic reflection image. We then perform a sensitivity study in order to investigate what information regarding the stochastic model parameters describing crustal velocity heterogeneity might potentially be recovered from the statistics of a seismic reflection image using this model. Finally, we present a Monte Carlo inversion strategy to estimate these parameters and we show examples of its application at two different source frequencies and using two different sets of prior information. Our results indicate that the inverse problem is inherently non-unique and that many different combinations of the vertical and lateral correlation lengths describing the velocity heterogeneity can yield seismic images with the same 2-D autocorrelation structure. 
The ratio of all of these possible combinations of vertical and lateral correlation lengths, however, remains roughly constant which indicates that, without additional prior information, the aspect ratio is the only parameter describing the stochastic seismic velocity structure that can be reliably recovered.
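The idea of reading an aspect ratio off a 2-D autocorrelation can be illustrated with a synthetic example. The sketch below is not the weak-scattering model or the Monte Carlo inversion described in the abstract: it builds a hypothetical anisotropic random field by smoothing white noise, computes its 2-D autocorrelation with the FFT, and takes the lag at which the autocorrelation falls below 1/e as a rough correlation length in each direction. All sizes and names are assumptions.

```python
import numpy as np

def autocorrelation_2d(image):
    """Normalised circular 2-D autocorrelation, zero lag moved to the centre."""
    centred = image - image.mean()
    power = np.abs(np.fft.fft2(centred)) ** 2        # power spectrum
    acf = np.fft.ifft2(power).real                   # Wiener-Khinchin theorem
    return np.fft.fftshift(acf / acf[0, 0])

def efold_lag(profile):
    """First lag at which a 1-D autocorrelation profile drops below 1/e."""
    below = np.nonzero(profile < np.exp(-1.0))[0]
    return int(below[0]) if below.size else len(profile)

# Synthetic field (assumption): white noise smoothed over 4 rows and 16 columns,
# so the structure is longer laterally than vertically.
rng = np.random.default_rng(0)
noise = rng.standard_normal((256, 256))
kernel = np.zeros((256, 256))
kernel[:4, :16] = 1.0
field = np.fft.ifft2(np.fft.fft2(noise) * np.fft.fft2(kernel)).real

acf = autocorrelation_2d(field)
cy, cx = acf.shape[0] // 2, acf.shape[1] // 2
vertical_lag = efold_lag(acf[cy:, cx])               # along rows (depth)
lateral_lag = efold_lag(acf[cy, cx:])                # along columns (distance)
aspect_ratio = lateral_lag / vertical_lag
```

In this toy setting the lateral lag exceeds the vertical lag, mirroring the smoothing window; the abstract's point is that for real seismic images only such a ratio, not the individual lengths, is reliably constrained.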
Abstract:
The objective of this study was to evaluate the efficiency of spatial statistical analysis in the selection of genotypes in a plant breeding program and, in particular, to demonstrate the benefits of the approach when experimental observations are not spatially independent. The basic material of this study was a yield trial of soybean lines, with five check varieties (fixed effects) and 110 test lines (random effects), in an augmented block design. The spatial analysis used a random field linear model (RFLM), with a covariance function estimated from the residuals of the analysis that assumed independent errors. Results showed a residual autocorrelation of significant magnitude and extent (range), which allowed better discrimination among genotypes (increased power of statistical tests, reduced standard errors of estimates and predictors, and a greater amplitude of predictor values) when the spatial analysis was applied. Furthermore, the spatial analysis led to a different ranking of the genetic materials compared with the non-spatial analysis, and yielded a selection less influenced by local variation effects.
Abstract:
DnaSP is a software package for a comprehensive analysis of DNA polymorphism data. Version 5 implements a number of new features and analytical methods allowing extensive DNA polymorphism analyses on large datasets. Among other features, the newly implemented methods allow for: (i) analyses on multiple data files; (ii) haplotype phasing; (iii) analyses on insertion/deletion polymorphism data; (iv) visualizing sliding window results integrated with available genome annotations in the UCSC browser.
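A sliding-window polymorphism analysis of the kind mentioned in the abstract can be sketched as follows. This is an illustration only and not DnaSP code: it computes nucleotide diversity (the average proportion of pairwise differences per site) in fixed-size windows over a hypothetical three-sequence alignment without gaps.

```python
from itertools import combinations

def nucleotide_diversity(sequences):
    """Average number of pairwise differences per site for aligned sequences."""
    pairs = list(combinations(sequences, 2))
    length = len(sequences[0])
    differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return differences / (len(pairs) * length)

def sliding_window_diversity(sequences, window, step):
    """Return (window start, nucleotide diversity) for each full window."""
    length = len(sequences[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [seq[start:start + window] for seq in sequences]
        results.append((start, nucleotide_diversity(chunk)))
    return results

# Hypothetical alignment (assumption): three sequences of ten sites each.
alignment = ["ACGTACGTAC",
             "ACGTACGAAC",
             "ACGTACCAAT"]

windows = sliding_window_diversity(alignment, window=5, step=5)
```

Here the first window contains no variable sites and the second contains all of them, which is the kind of contrast a sliding-window plot is meant to expose along a sequence.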
Abstract:
Analysis by reduction is a linguistically motivated method for checking the correctness of a sentence. It can be modelled by restarting automata. In this paper we propose a method for learning restarting automata which are strictly locally testable (SLT-R-automata). The method is based on the concept of identification in the limit from positive examples only. We also characterize the class of languages accepted by SLT-R-automata with respect to the Chomsky hierarchy.
Abstract:
In this paper, we discuss Conceptual Knowledge Discovery in Databases (CKDD) in its connection with Data Analysis. Our approach is based on Formal Concept Analysis, a mathematical theory which has been developed and proven useful during the last 20 years. Formal Concept Analysis has led to a theory of conceptual information systems which has been applied by using the management system TOSCANA in a wide range of domains. In this paper, we use such an application in database marketing to demonstrate how methods and procedures of CKDD can be applied in Data Analysis. In particular, we show the interplay and integration of data mining and data analysis techniques based on Formal Concept Analysis. The main concern of this paper is to explain how the transition from data to knowledge can be supported by a TOSCANA system. To clarify the transition steps we discuss their correspondence to the five levels of knowledge representation established by R. Brachman and to the steps of empirically grounded theory building proposed by A. Strauss and J. Corbin.
Abstract:
Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.
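The notion of an iceberg concept lattice can be made concrete with a small sketch. The code below is a naive enumeration over attribute subsets, not the TITANIC algorithm discussed in the abstract, and it only lists the frequent concepts without building the lattice order. The transaction context and the support threshold are hypothetical.

```python
from itertools import combinations

def extent(context, attributes):
    """Objects that have every attribute in the given set."""
    return frozenset(o for o, owned in context.items() if attributes <= owned)

def intent(context, objects, all_attributes):
    """Attributes shared by every object in the given set."""
    shared = set(all_attributes)
    for o in objects:
        shared &= context[o]
    return frozenset(shared)

def iceberg_concepts(context, min_support):
    """Formal concepts whose extent covers at least min_support of the objects."""
    all_attributes = frozenset().union(*context.values())
    concepts = set()
    for size in range(len(all_attributes) + 1):
        for combo in combinations(sorted(all_attributes), size):
            objects = extent(context, frozenset(combo))
            if len(objects) / len(context) >= min_support:
                # Closing the attribute set yields the concept's intent.
                concepts.add((objects, intent(context, objects, all_attributes)))
    return concepts

# Hypothetical transactions (assumption): object -> set of attributes.
context = {
    "t1": frozenset({"bread", "milk"}),
    "t2": frozenset({"bread", "butter"}),
    "t3": frozenset({"bread", "milk", "butter"}),
    "t4": frozenset({"milk"}),
}

concepts = iceberg_concepts(context, min_support=0.5)
```

In this toy context the attribute set {butter} closes to {bread, butter}, which corresponds to the association rule "butter implies bread" holding in every transaction that contains butter; closed sets of this kind are what makes the condensed representation possible.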