975 resultados para Structured data


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Objective: The Agency for Healthcare Research and Quality (AHRQ) developed Patient Safety Indicators (PSIs) for use with ICD-9-CM data. Many countries have adopted ICD-10 for coding hospital diagnoses. We conducted this study to develop an internationally harmonized ICD-10 coding algorithm for the AHRQ PSIs. Methods: The AHRQ PSI Version 2.1 has been translated into ICD-10-AM (Australian Modification), and PSI Version 3.0a has been independently translated into ICD-10-GM (German Modification). We converted these two country-specific coding algorithms into ICD-10-WHO (World Health Organization version) and combined them to form one master list. Members of an international expert panel-including physicians, professional medical coders, disease classification specialists, health services researchers, epidemiologists, and users of the PSI-independently evaluated this master list and rated each code as either "include," "exclude," or "uncertain," following the AHRQ PSI definitions. After summarizing the independent rating results, we held a face-to-face meeting to discuss codes for which there was no unanimous consensus and newly proposed codes. A modified Delphi method was employed to generate a final ICD-10 WHO coding list. Results: Of 20 PSIs, 15 that were based mainly on diagnosis codes were selected for translation. At the meeting, panelists discussed 794 codes for which consensus had not been achieved and 2,541 additional codes that were proposed by individual panelists for consideration prior to the meeting. Three documents were generated: a PSI ICD-10-WHO version-coding list, a list of issues for consideration on certain AHRQ PSIs and ICD-9-CM codes, and a recommendation to WHO to improve specification of some disease classifications. Conclusion: An ICD-10-WHO PSI coding list has been developed and structured in a manner similar to the AHRQ manual. Although face validity of the list has been ensured through a rigorous expert panel assessment, its true validity and applicability should be assessed internationally.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The populations of parasites and infectious agents are most of the time structured in complex hierarchy that lies beyond the classical nested design described by Wright's F-statistics (F(IS), F(ST) and F(IT)). In this note we propose a user-friendly step-by-step notice for using recent software (HierFstat) that computes and test fixation indices for any hierarchical structure. We add some tricks and tips for some special data kind (haploid, single locus), some other procedure (bootstrap over loci) and how to handle crossed factors.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In spite of its relative importance in the economy of many countriesand its growing interrelationships with other sectors, agriculture has traditionally been excluded from accounting standards. Nevertheless, to support its Common Agricultural Policy, for years the European Commission has been making an effort to obtain standardized information on the financial performance and condition of farms. Through the Farm Accountancy Data Network (FADN), every year data are gathered from a rotating sample of 60.000 professional farms across all member states. FADN data collection is not structured as an accounting cycle but as an extensive questionnaire. This questionnaire refers to assets, liabilities, revenues and expenses, and seems to try to obtain a "true and fair view" of the financial performance and condition of the farms it surveys. However, the definitions used in the questionnaire and the way data is aggregated often appear flawed from an accounting perspective. The objective of this paper is to contrast the accounting principles implicit in the FADN questionnaire with generally accepted accounting principles, particularly those found in the IVth Directive of the European Union, on the one hand, and those recently proposed by the International Accounting Standards Committee’s Steering Committeeon Agriculture in its Draft Statement of Principles, on the other hand. There are two reasons why this is useful. First, it allows to make suggestions how the information provided by FADN could be more in accordance with the accepted accounting framework, and become a more valuable tool for policy makers, farmers, and other stakeholders. Second, it helps assessing the suitability of FADN to become the starting point for a European accounting standard on agriculture.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Geographical body size variation has long interested evolutionary biologists, and a range of mechanisms have been proposed to explain the observed patterns. It is considered to be more puzzling in ectotherms than in endotherms, and integrative approaches are necessary for testing non-exclusive alternative mechanisms. Using lacertid lizards as a model, we adopted an integrative approach, testing different hypotheses for both sexes while incorporating temporal, spatial, and phylogenetic autocorrelation at the individual level. We used data on the Spanish Sand Racer species group from a field survey to disentangle different sources of body size variation through environmental and individual genetic data, while accounting for temporal and spatial autocorrelation. A variation partitioning method was applied to separate independent and shared components of ecology and phylogeny, and estimated their significance. Then, we fed-back our models by controlling for relevant independent components. The pattern was consistent with the geographical Bergmann's cline and the experimental temperature-size rule: adults were larger at lower temperatures (and/or higher elevations). This result was confirmed with additional multi-year independent data-set derived from the literature. Variation partitioning showed no sex differences in phylogenetic inertia but showed sex differences in the independent component of ecology; primarily due to growth differences. Interestingly, only after controlling for independent components did primary productivity also emerge as an important predictor explaining size variation in both sexes. This study highlights the importance of integrating individual-based genetic information, relevant ecological parameters, and temporal and spatial autocorrelation in sex-specific models to detect potentially important hidden effects. Our individual-based approach devoted to extract and control for independent components was useful to reveal hidden effects linked with alternative non-exclusive hypothesis, such as those of primary productivity. Also, including measurement date allowed disentangling and controlling for short-term temporal autocorrelation reflecting sex-specific growth plasticity.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A recent study suggests that sex-specific dispersal rates can be quantitatively estimated on the basis of sex- and state-specific (pre- vs. postdispersal) F-statistics. In the present paper, we extend this approach to account for the hierarchical structure of natural populations, and we validate it through individual-based simulations. The model is applied to an empirical data set consisting of 536 individuals (males, females, and predispersal juveniles) of greater white-toothed shrews (Crocidura russula), sampled according to a hierarchical design and typed for seven autosomal microsatellite loci. From this dataset, dispersal is significantly female biased at the local scale (breeding-group level), but not at the larger scale (among local populations). We argue that selective pressures on dispersal are likely to depend on the spatial scale considered, and that short-distance dispersal should mainly respond to kin interactions (inbreeding or kin competition avoidance), which exert differential pressure on males and females.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The coverage and volume of geo-referenced datasets are extensive and incessantly¦growing. The systematic capture of geo-referenced information generates large volumes¦of spatio-temporal data to be analyzed. Clustering and visualization play a key¦role in the exploratory data analysis and the extraction of knowledge embedded in¦these data. However, new challenges in visualization and clustering are posed when¦dealing with the special characteristics of this data. For instance, its complex structures,¦large quantity of samples, variables involved in a temporal context, high dimensionality¦and large variability in cluster shapes.¦The central aim of my thesis is to propose new algorithms and methodologies for¦clustering and visualization, in order to assist the knowledge extraction from spatiotemporal¦geo-referenced data, thus improving making decision processes.¦I present two original algorithms, one for clustering: the Fuzzy Growing Hierarchical¦Self-Organizing Networks (FGHSON), and the second for exploratory visual data analysis:¦the Tree-structured Self-organizing Maps Component Planes. In addition, I present¦methodologies that combined with FGHSON and the Tree-structured SOM Component¦Planes allow the integration of space and time seamlessly and simultaneously in¦order to extract knowledge embedded in a temporal context.¦The originality of the FGHSON lies in its capability to reflect the underlying structure¦of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of¦clusters is crucial when data include complex structures with large variability of cluster¦shapes, variances, densities and number of clusters. The most important characteristics¦of the FGHSON include: (1) It does not require an a-priori setup of the number¦of clusters. (2) The algorithm executes several self-organizing processes in parallel.¦Hence, when dealing with large datasets the processes can be distributed reducing the¦computational cost. (3) Only three parameters are necessary to set up the algorithm.¦In the case of the Tree-structured SOM Component Planes, the novelty of this algorithm¦lies in its ability to create a structure that allows the visual exploratory data analysis¦of large high-dimensional datasets. This algorithm creates a hierarchical structure¦of Self-Organizing Map Component Planes, arranging similar variables' projections in¦the same branches of the tree. Hence, similarities on variables' behavior can be easily¦detected (e.g. local correlations, maximal and minimal values and outliers).¦Both FGHSON and the Tree-structured SOM Component Planes were applied in¦several agroecological problems proving to be very efficient in the exploratory analysis¦and clustering of spatio-temporal datasets.¦In this thesis I also tested three soft competitive learning algorithms. Two of them¦well-known non supervised soft competitive algorithms, namely the Self-Organizing¦Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs); and the¦third was our original contribution, the FGHSON. Although the algorithms presented¦here have been used in several areas, to my knowledge there is not any work applying¦and comparing the performance of those techniques when dealing with spatiotemporal¦geospatial data, as it is presented in this thesis.¦I propose original methodologies to explore spatio-temporal geo-referenced datasets¦through time. Our approach uses time windows to capture temporal similarities and¦variations by using the FGHSON clustering algorithm. The developed methodologies¦are used in two case studies. In the first, the objective was to find similar agroecozones¦through time and in the second one it was to find similar environmental patterns¦shifted in time.¦Several results presented in this thesis have led to new contributions to agroecological¦knowledge, for instance, in sugar cane, and blackberry production.¦Finally, in the framework of this thesis we developed several software tools: (1)¦a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called¦BIS (Bio-inspired Identification of Similar agroecozones) an interactive graphical user¦interface tool which integrates the FGHSON algorithm with Google Earth in order to¦show zones with similar agroecological characteristics.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

La présente étude est à la fois une évaluation du processus de la mise en oeuvre et des impacts de la police de proximité dans les cinq plus grandes zones urbaines de Suisse - Bâle, Berne, Genève, Lausanne et Zurich. La police de proximité (community policing) est à la fois une philosophie et une stratégie organisationnelle qui favorise un partenariat renouvelé entre la police et les communautés locales dans le but de résoudre les problèmes relatifs à la sécurité et à l'ordre public. L'évaluation de processus a analysé des données relatives aux réformes internes de la police qui ont été obtenues par l'intermédiaire d'entretiens semi-structurés avec des administrateurs clés des cinq départements de police, ainsi que dans des documents écrits de la police et d'autres sources publiques. L'évaluation des impacts, quant à elle, s'est basée sur des variables contextuelles telles que des statistiques policières et des données de recensement, ainsi que sur des indicateurs d'impacts construit à partir des données du Swiss Crime Survey (SCS) relatives au sentiment d'insécurité, à la perception du désordre public et à la satisfaction de la population à l'égard de la police. Le SCS est un sondage régulier qui a permis d'interroger des habitants des cinq grandes zones urbaines à plusieurs reprises depuis le milieu des années 1980. L'évaluation de processus a abouti à un « Calendrier des activités » visant à créer des données de panel permettant de mesurer les progrès réalisés dans la mise en oeuvre de la police de proximité à l'aide d'une grille d'évaluation à six dimensions à des intervalles de cinq ans entre 1990 et 2010. L'évaluation des impacts, effectuée ex post facto, a utilisé un concept de recherche non-expérimental (observational design) dans le but d'analyser les impacts de différents modèles de police de proximité dans des zones comparables à travers les cinq villes étudiées. Les quartiers urbains, délimités par zone de code postal, ont ainsi été regroupés par l'intermédiaire d'une typologie réalisée à l'aide d'algorithmes d'apprentissage automatique (machine learning). Des algorithmes supervisés et non supervisés ont été utilisés sur les données à haute dimensionnalité relatives à la criminalité, à la structure socio-économique et démographique et au cadre bâti dans le but de regrouper les quartiers urbains les plus similaires dans des clusters. D'abord, les cartes auto-organisatrices (self-organizing maps) ont été utilisées dans le but de réduire la variance intra-cluster des variables contextuelles et de maximiser simultanément la variance inter-cluster des réponses au sondage. Ensuite, l'algorithme des forêts d'arbres décisionnels (random forests) a permis à la fois d'évaluer la pertinence de la typologie de quartier élaborée et de sélectionner les variables contextuelles clés afin de construire un modèle parcimonieux faisant un minimum d'erreurs de classification. Enfin, pour l'analyse des impacts, la méthode des appariements des coefficients de propension (propensity score matching) a été utilisée pour équilibrer les échantillons prétest-posttest en termes d'âge, de sexe et de niveau d'éducation des répondants au sein de chaque type de quartier ainsi identifié dans chacune des villes, avant d'effectuer un test statistique de la différence observée dans les indicateurs d'impacts. De plus, tous les résultats statistiquement significatifs ont été soumis à une analyse de sensibilité (sensitivity analysis) afin d'évaluer leur robustesse face à un biais potentiel dû à des covariables non observées. L'étude relève qu'au cours des quinze dernières années, les cinq services de police ont entamé des réformes majeures de leur organisation ainsi que de leurs stratégies opérationnelles et qu'ils ont noué des partenariats stratégiques afin de mettre en oeuvre la police de proximité. La typologie de quartier développée a abouti à une réduction de la variance intra-cluster des variables contextuelles et permet d'expliquer une partie significative de la variance inter-cluster des indicateurs d'impacts avant la mise en oeuvre du traitement. Ceci semble suggérer que les méthodes de géocomputation aident à équilibrer les covariables observées et donc à réduire les menaces relatives à la validité interne d'un concept de recherche non-expérimental. Enfin, l'analyse des impacts a révélé que le sentiment d'insécurité a diminué de manière significative pendant la période 2000-2005 dans les quartiers se trouvant à l'intérieur et autour des centres-villes de Berne et de Zurich. Ces améliorations sont assez robustes face à des biais dus à des covariables inobservées et covarient dans le temps et l'espace avec la mise en oeuvre de la police de proximité. L'hypothèse alternative envisageant que les diminutions observées dans le sentiment d'insécurité soient, partiellement, un résultat des interventions policières de proximité semble donc être aussi plausible que l'hypothèse nulle considérant l'absence absolue d'effet. Ceci, même si le concept de recherche non-expérimental mis en oeuvre ne peut pas complètement exclure la sélection et la régression à la moyenne comme explications alternatives. The current research project is both a process and impact evaluation of community policing in Switzerland's five major urban areas - Basel, Bern, Geneva, Lausanne, and Zurich. Community policing is both a philosophy and an organizational strategy that promotes a renewed partnership between the police and the community to solve problems of crime and disorder. The process evaluation data on police internal reforms were obtained through semi-structured interviews with key administrators from the five police departments as well as from police internal documents and additional public sources. The impact evaluation uses official crime records and census statistics as contextual variables as well as Swiss Crime Survey (SCS) data on fear of crime, perceptions of disorder, and public attitudes towards the police as outcome measures. The SCS is a standing survey instrument that has polled residents of the five urban areas repeatedly since the mid-1980s. The process evaluation produced a "Calendar of Action" to create panel data to measure community policing implementation progress over six evaluative dimensions in intervals of five years between 1990 and 2010. The impact evaluation, carried out ex post facto, uses an observational design that analyzes the impact of the different community policing models between matched comparison areas across the five cities. Using ZIP code districts as proxies for urban neighborhoods, geospatial data mining algorithms serve to develop a neighborhood typology in order to match the comparison areas. To this end, both unsupervised and supervised algorithms are used to analyze high-dimensional data on crime, the socio-economic and demographic structure, and the built environment in order to classify urban neighborhoods into clusters of similar type. In a first step, self-organizing maps serve as tools to develop a clustering algorithm that reduces the within-cluster variance in the contextual variables and simultaneously maximizes the between-cluster variance in survey responses. The random forests algorithm then serves to assess the appropriateness of the resulting neighborhood typology and to select the key contextual variables in order to build a parsimonious model that makes a minimum of classification errors. Finally, for the impact analysis, propensity score matching methods are used to match the survey respondents of the pretest and posttest samples on age, gender, and their level of education for each neighborhood type identified within each city, before conducting a statistical test of the observed difference in the outcome measures. Moreover, all significant results were subjected to a sensitivity analysis to assess the robustness of these findings in the face of potential bias due to some unobserved covariates. The study finds that over the last fifteen years, all five police departments have undertaken major reforms of their internal organization and operating strategies and forged strategic partnerships in order to implement community policing. The resulting neighborhood typology reduced the within-cluster variance of the contextual variables and accounted for a significant share of the between-cluster variance in the outcome measures prior to treatment, suggesting that geocomputational methods help to balance the observed covariates and hence to reduce threats to the internal validity of an observational design. Finally, the impact analysis revealed that fear of crime dropped significantly over the 2000-2005 period in the neighborhoods in and around the urban centers of Bern and Zurich. These improvements are fairly robust in the face of bias due to some unobserved covariate and covary temporally and spatially with the implementation of community policing. The alternative hypothesis that the observed reductions in fear of crime were at least in part a result of community policing interventions thus appears at least as plausible as the null hypothesis of absolutely no effect, even if the observational design cannot completely rule out selection and regression to the mean as alternative explanations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The emergence of powerful new technologies, the existence of large quantities of data, and increasing demands for the extraction of added value from these technologies and data have created a number of significant challenges for those charged with both corporate and information technology management. The possibilities are great, the expectations high, and the risks significant. Organisations seeking to employ cloud technologies and exploit the value of the data to which they have access, be this in the form of "Big Data" available from different external sources or data held within the organisation, in structured or unstructured formats, need to understand the risks involved in such activities. Data owners have responsibilities towards the subjects of the data and must also, frequently, demonstrate that they are in compliance with current standards, laws and regulations. This thesis sets out to explore the nature of the technologies that organisations might utilise, identify the most pertinent constraints and risks, and propose a framework for the management of data from discovery to external hosting that will allow the most significant risks to be managed through the definition, implementation, and performance of appropriate internal control activities.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper presents some contemporary approaches to spatial environmental data analysis. The main topics are concentrated on the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, development and application of conditional stochastic simulation models. The innovative part of the paper presents integrated/hybrid model-machine learning (ML) residuals sequential simulations-MLRSS. The models are based on multilayer perceptron and support vector regression ML algorithms used for modeling long-range spatial trends and sequential simulations of the residuals. NIL algorithms deliver non-linear solution for the spatial non-stationary problems, which are difficult for geostatistical approach. Geostatistical tools (variography) are used to characterize performance of ML algorithms, by analyzing quality and quantity of the spatially structured information extracted from data with ML algorithms. Sequential simulations provide efficient assessment of uncertainty and spatial variability. Case study from the Chernobyl fallouts illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data driven and geostatistical model based approaches, can be efficiently used in decision-making process. (C) 2003 Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present experiments in which the laterally confined flow of a surfactant film driven by controlled surface tension gradients causes the subtended liquid layer to self-organize into an inner upstream microduct surrounded by the downstream flow. The anomalous interfacial flow profiles and the concomitant backflow are a result of the feedback between two-dimensional and three-dimensional microfluidics realized during flow in open microchannels. Bulk and surface particle image velocimetry data combined with an interfacial hydrodynamics model explain the dependence of the observed phenomena on channel geometry.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a novel formulation to solve the problem of intra-voxel reconstruction of the fibre orientation distribution function (FOD) in each voxel of the white matter of the brain from diffusion MRI data. The majority of the state-of-the-art methods in the field perform the reconstruction on a voxel-by-voxel level, promoting sparsity of the orientation distribution. Recent methods have proposed a global denoising of the diffusion data using spatial information prior to reconstruction, while others promote spatial regularisation through an additional empirical prior on the diffusion image at each q-space point. Our approach reconciles voxelwise sparsity and spatial regularisation and defines a spatially structured FOD sparsity prior, where the structure originates from the spatial coherence of the fibre orientation between neighbour voxels. The method is shown, through both simulated and real data, to enable accurate FOD reconstruction from a much lower number of q-space samples than the state of the art, typically 15 samples, even for quite adverse noise conditions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We consider the problem of multiple correlated sparse signals reconstruction and propose a new implementation of structured sparsity through a reweighting scheme. We present a particular application for diffusion Magnetic Resonance Imaging data and show how this procedure can be used for fibre orientation reconstruction in the white matter of the brain. In that framework, our structured sparsity prior can be used to exploit the fundamental coherence between fibre directions in neighbour voxels. Our method approaches the ℓ0 minimisation through a reweighted ℓ1-minimisation scheme. The weights are here defined in such a way to promote correlated sparsity between neighbour signals.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Lattice valued fuzziness is more general than crispness or fuzziness based on the unit interval. In this work, we present a query language for a lattice based fuzzy database. We define a Lattice Fuzzy Structured Query Language (LFSQL) taking its membership values from an arbitrary lattice L. LFSQL can handle, manage and represent crisp values, linear ordered membership degrees and also allows membership degrees from lattices with non-comparable values. This gives richer membership degrees, and hence makes LFSQL more flexible than FSQL or SQL. In order to handle vagueness or imprecise information, every entry into an L-fuzzy database is an L-fuzzy set instead of crisp values. All of this makes LFSQL an ideal query language to handle imprecise data where some factors are non-comparable. After defining the syntax of the language formally, we provide its semantics using L-fuzzy sets and relations. The semantics can be used in future work to investigate concepts such as functional dependencies. Last but not least, we present a parser for LFSQL implemented in Haskell.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Data mining means to summarize information from large amounts of raw data. It is one of the key technologies in many areas of economy, science, administration and the internet. In this report we introduce an approach for utilizing evolutionary algorithms to breed fuzzy classifier systems. This approach was exercised as part of a structured procedure by the students Achler, Göb and Voigtmann as contribution to the 2006 Data-Mining-Cup contest, yielding encouragingly positive results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Geochemical data that is derived from the whole or partial analysis of various geologic materials represent a composition of mineralogies or solute species. Minerals are composed of structured relationships between cations and anions which, through atomic and molecular forces, keep the elements bound in specific configurations. The chemical compositions of minerals have specific relationships that are governed by these molecular controls. In the case of olivine, there is a well-defined relationship between Mn-Fe-Mg with Si. Balances between the principal elements defining olivine composition and other significant constituents in the composition (Al, Ti) have been defined, resulting in a near-linear relationship between the logarithmic relative proportion of Si versus (MgMnFe) and Mg versus (MnFe), which is typically described but poorly illustrated in the simplex. The present contribution corresponds to ongoing research, which attempts to relate stoichiometry and geochemical data using compositional geometry. We describe here the approach by which stoichiometric relationships based on mineralogical constraints can be accounted for in the space of simplicial coordinates using olivines as an example. Further examples for other mineral types (plagioclases and more complex minerals such as clays) are needed. Issues that remain to be dealt with include the reduction of a bulk chemical composition of a rock comprised of several minerals from which appropriate balances can be used to describe the composition in a realistic mineralogical framework. The overall objective of our research is to answer the question: In the cases where the mineralogy is unknown, are there suitable proxies that can be substituted? Kew words: Aitchison geometry, balances, mineral composition, oxides