901 results for Data dissemination and sharing
Abstract:
Graduate Program in Information Science - FFC
Abstract:
In soil surveys, several sampling systems can be used to define the most representative sites for sample collection and description of soil profiles. In recent years, the conditioned Latin hypercube sampling system has gained prominence for soil surveys. In Brazil, most soil maps are at small scales and available only on paper, which hinders their refinement. The objectives of this work were: (i) to compare two conditioned Latin hypercube sampling systems for mapping soil classes and soil properties; (ii) to retrieve information from a detailed-scale soil map of a pilot watershed for its refinement, comparing two data mining tools, and to validate the new soil map; and (iii) to create and validate a soil map of a much larger, similar area by extrapolating information extracted from the existing soil map. Two sampling designs were created, one by conditioned Latin hypercube sampling and one by cost-constrained conditioned Latin hypercube sampling. At each prospection site, soil classification and measurement of the A horizon thickness were performed. Maps were generated and validated for each sampling design, and the efficiency of the two methods was compared. The conditioned Latin hypercube captured greater variability of soils and properties than the cost-constrained conditioned Latin hypercube, although the former entailed greater difficulty in field work. The conditioned Latin hypercube can capture greater soil variability, while the cost-constrained conditioned Latin hypercube presents great potential for use in soil surveys, especially in areas of difficult access. From an existing detailed-scale soil map of a pilot watershed, topographic information for each soil class was extracted from a Digital Elevation Model and its derivatives using two data mining tools. Maps were generated with each tool. The more accurate tool was used to extrapolate soil information to a much larger, similar area, and the resulting map was validated. It was possible to retrieve the existing soil map information and apply it to a larger area with similar soil-forming factors at much lower financial cost. The KnowledgeMiner data mining tool and ArcSIE, used to create the soil map, presented better results and enabled the use of the existing soil map to extract soil information and apply it to similar, larger areas at reduced cost, which is especially important in developing countries with limited financial resources for such activities, such as Brazil.
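The abstract does not give implementation details, but the selection step of conditioned Latin hypercube sampling, and the cost-constrained variant, can be illustrated with a minimal sketch. The objective function, the random-swap search, and the access-cost penalty below are deliberate simplifications for illustration only; the covariate table and all parameter values are hypothetical.

```python
# Minimal sketch of (cost-constrained) conditioned Latin hypercube sampling.
import numpy as np

def clhs(covariates, n_samples, n_iter=5000, cost=None, cost_weight=0.0, seed=0):
    """Return indices of sites selected by a simplified cLHS search.

    covariates  : (n_sites, n_vars) array of continuous covariates
    cost        : optional per-site access cost (cost-constrained variant)
    cost_weight : weight of the cost term in the objective
    """
    rng = np.random.default_rng(seed)
    n_sites, n_vars = covariates.shape
    # Quantile edges defining the Latin-hypercube strata of each covariate.
    edges = np.quantile(covariates, np.linspace(0, 1, n_samples + 1), axis=0)

    def objective(idx):
        sample = covariates[idx]
        o = 0.0
        for j in range(n_vars):
            counts, _ = np.histogram(sample[:, j], bins=edges[:, j])
            o += np.abs(counts - 1).sum()        # ideally one sample per stratum
        if cost is not None:
            o += cost_weight * cost[idx].sum()   # penalize hard-to-reach sites
        return o

    idx = rng.choice(n_sites, size=n_samples, replace=False)
    best = objective(idx)
    for _ in range(n_iter):                      # simple random-swap improvement
        candidate = idx.copy()
        candidate[rng.integers(n_samples)] = rng.choice(
            np.setdiff1d(np.arange(n_sites), idx))
        value = objective(candidate)
        if value <= best:
            idx, best = candidate, value
    return idx

# Hypothetical usage: 500 candidate sites described by 4 terrain covariates.
covs = np.random.default_rng(1).normal(size=(500, 4))
print(clhs(covs, n_samples=20)[:5])
```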
Abstract:
The use of markers distributed all along the genome may increase the accuracy of the predicted additive genetic value of young animals that are candidates for selection as breeding stock. In commercial herds, due to the cost of genotyping, only some animals are genotyped, and procedures divided into two or three steps are used to include these genomic data in the genetic evaluation. However, genomic evaluation may be carried out in a single unified step that combines phenotypic, pedigree and genomic data. The aim of this study was to compare a multiple-trait model using only pedigree information with another using both pedigree and genomic data. In this study, 9,318 lactations from 3,061 buffaloes were used; 384 buffaloes were genotyped using an Illumina bovine chip (Illumina Infinium® BovineHD BeadChip). Seven traits were analyzed: milk yield (MY), fat yield (FY), protein yield (PY), lactose yield (LY), fat percentage (F%), protein percentage (P%) and somatic cell score (SCS). Two analyses were done: one using phenotypic and pedigree information (matrix A) and the other using a matrix based on pedigree and genomic information (single step, matrix H). The (co)variance components were estimated by multiple-trait analysis using Bayesian inference, applying an animal model through Gibbs sampling. The model included the fixed effects of contemporary group (herd-year-calving season) and number of milkings (2 levels), and age of buffalo at calving as a covariate (linear and quadratic effects). The additive genetic, permanent environmental, and residual effects were included as random effects in the model. The heritability estimates using matrix A were 0.25, 0.22, 0.26, 0.17, 0.37, 0.42 and 0.26, and using matrix H were 0.25, 0.24, 0.26, 0.18, 0.38, 0.46 and 0.26 for MY, FY, PY, LY, F%, P% and SCS, respectively. The estimates of the additive genetic effects were similar in both analyses, but accuracies were higher with matrix H (more than 15% higher for the traits studied). The heritability estimates were moderate, indicating that genetic gain can be obtained under selection. The use of genomic information in the analyses increases accuracy, permitting better estimation of the additive genetic value of the animals.
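The abstract does not spell out how the H matrix is built. A common formulation in the single-step literature combines the pedigree relationship matrix A with the genomic relationship matrix G of the genotyped animals, as sketched below; A22 denotes the pedigree block for the genotyped animals, and any blending or scaling factors used by the authors are not reproduced here.

```latex
% Sketch of the usual single-step relationship matrix (textbook form,
% not necessarily the exact parameterization used in this study):
H^{-1} = A^{-1} +
\begin{pmatrix}
0 & 0 \\
0 & G^{-1} - A_{22}^{-1}
\end{pmatrix}
```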
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Prior studies of phylogenetic relationships among phocoenids, based on morphology and molecular sequence data, conflict and yield unresolved relationships among species. This study evaluates a comprehensive set of cranial, postcranial, and soft anatomical characters to infer interrelationships among extant species and several well-known fossil phocoenids, using two different methods to analyze polymorphic data: polymorphic coding and a frequency step matrix. Our phylogenetic results confirmed phocoenid monophyly. The previously proposed division of Phocoenidae into two subfamilies was rejected, as was the alliance of the two extinct genera Salumiphocaena and Piscolithax with Phocoena dioptrica and Phocoenoides dalli. Extinct phocoenids are basal to all extant species. We also examined the origin and distribution of porpoises within the context of this phylogenetic framework. Phocoenid phylogeny, together with available geologic evidence, suggests that the early history of phocoenids was centered in the North Pacific during the middle Miocene, with subsequent dispersal into the southern hemisphere in the middle Pliocene. A cooling period in the Pleistocene allowed dispersal of the southern ancestor of Phocoena sinus into the North Pacific (Gulf of California).
Abstract:
Mashups are becoming increasingly popular as end users are able to easily access, manipulate, and compose data from several web sources. To support end users, communities are forming around mashup development environments that facilitate sharing code and knowledge. We have observed, however, that end-user mashups tend to suffer from several deficiencies, such as inoperable components or references to invalid data sources, and that those deficiencies are often propagated through the rampant reuse in these end-user communities. In this work, we identify and specify ten code smells indicative of deficiencies we observed in a sample of 8,051 pipe-like web mashups developed by thousands of end users in the popular Yahoo! Pipes environment. We show through an empirical study that end users generally prefer pipes that lack those smells, and then present eleven specialized refactorings that we designed to target and remove the smells. Our refactorings reduce the complexity of pipes, increase their abstraction, update broken sources of data and dated components, and standardize pipes to fit the community development patterns. Our assessment of the sample of mashups shows that smells are present in 81% of the pipes, and that the proposed refactorings can reduce that number to 16%, illustrating the potential of refactoring to support thousands of end users developing pipe-like mashups.
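As an illustration of the kind of analysis described, and not the authors' tool, the sketch below checks a toy pipe-like mashup, represented as modules and wires, for two hypothetical smells: unreachable modules and dead data sources. The data model, module kinds and smell names are assumptions made for the example.

```python
# Minimal sketch: detecting two illustrative "smells" in a pipe-like mashup.
# The pipe representation (modules + wires) is hypothetical, not the
# Yahoo! Pipes data model.
from dataclasses import dataclass

@dataclass
class Module:
    mid: str
    kind: str                      # e.g. "fetch", "filter", "output"
    source_url: str | None = None

@dataclass
class Pipe:
    modules: dict[str, Module]
    wires: list[tuple[str, str]]   # (from_module, to_module)

def unreachable_modules(pipe: Pipe, output_id: str = "output") -> set[str]:
    """Modules whose result never reaches the output (a 'dead code' smell)."""
    incoming: dict[str, list[str]] = {}
    for src, dst in pipe.wires:
        incoming.setdefault(dst, []).append(src)
    reachable, stack = set(), [output_id]
    while stack:
        m = stack.pop()
        if m in reachable:
            continue
        reachable.add(m)
        stack.extend(incoming.get(m, []))
    return set(pipe.modules) - reachable

def dead_sources(pipe: Pipe, known_dead: set[str]) -> list[str]:
    """Fetch modules pointing at URLs known to be unavailable."""
    return [m.mid for m in pipe.modules.values()
            if m.kind == "fetch" and m.source_url in known_dead]

# Tiny usage example with hypothetical modules.
pipe = Pipe(
    modules={
        "f1": Module("f1", "fetch", "http://example.com/feed.xml"),
        "f2": Module("f2", "fetch", "http://example.com/gone.xml"),
        "flt": Module("flt", "filter"),
        "output": Module("output", "output"),
    },
    wires=[("f1", "flt"), ("flt", "output")],   # f2 is never wired in
)
print(unreachable_modules(pipe))                             # {'f2'}
print(dead_sources(pipe, {"http://example.com/gone.xml"}))   # ['f2']
```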
Abstract:
Last Glacial Maximum simulated sea surface temperatures from the Paleo-Climate version of the National Center for Atmospheric Research Coupled Climate Model (NCAR-CCSM) are compared with available reconstructions and data-based products in the tropical and South Atlantic region. Model results are compared to proxy data based on the Multiproxy Approach for the Reconstruction of the Glacial Ocean surface product (MARGO). Results show that the model sea surface temperature is not consistent with the proxy data over the whole region of interest. Discrepancies are found in the eastern, equatorial and high-latitude South Atlantic. The model overestimates the cooling in the southern South Atlantic (near 50 degrees S) shown by the proxy data. Near the equator, model and proxies are in better agreement. In the eastern part of the equatorial basin the model underestimates the cooling shown by all proxies. A northward shift in the position of the subtropical convergence zone in the simulation suggests a compression and/or an equatorward shift of the subtropical gyre at the surface, consistent with what is observed in the proxy reconstruction.
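A minimal sketch of the kind of model-proxy comparison described, assuming the model SST field and a MARGO-style reconstruction have already been regridded onto a common latitude-longitude grid; the variable names, toy fields and region boundaries are illustrative only.

```python
# Illustrative model-minus-proxy SST comparison on a shared grid (NumPy only).
import numpy as np

def regional_bias(model_sst, proxy_sst, lats, lons, lat_range, lon_range):
    """Area-weighted mean (model - proxy) over a lat/lon box, ignoring gaps."""
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    diff = model_sst[np.ix_(lat_mask, lon_mask)] - proxy_sst[np.ix_(lat_mask, lon_mask)]
    weights = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    valid = ~np.isnan(diff)                     # proxy reconstructions are sparse
    return float(np.sum(diff[valid] * weights[valid]) / np.sum(weights[valid]))

# Hypothetical 1-degree grid covering the tropical and South Atlantic.
lats = np.arange(-60.0, 20.0, 1.0)
lons = np.arange(-60.0, 20.0, 1.0)
model_sst = 20.0 - 0.2 * np.abs(lats)[:, None] + 0.0 * lons          # toy field
proxy_sst = model_sst + np.random.default_rng(0).normal(0, 0.5, model_sst.shape)

# e.g. the southern South Atlantic box near 50 degrees S
print(regional_bias(model_sst, proxy_sst, lats, lons, (-55, -45), (-60, 20)))
```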
Abstract:
This paper provides a brief but comprehensive guide to creating, preparing and dissecting a 'virtual' fossil, using a worked example to demonstrate some standard data processing techniques. Computed tomography (CT) is a 3D imaging modality for producing 'virtual' models of an object on a computer. In the last decade, CT technology has greatly improved, allowing bigger and denser objects to be scanned increasingly rapidly. The technique has now reached a stage where systems can facilitate large-scale, non-destructive comparative studies of extinct fossils and their living relatives. Consequently, the main limiting factor in CT-based analyses is no longer scanning, but the hurdles of data processing (see disclaimer). The latter comprises the techniques required to convert a 3D CT volume (a stack of digital slices) into a virtual image of the fossil that can be prepared (separated) from the matrix and 'dissected' into its anatomical parts. This technique can be applied to specimens, or parts of specimens, embedded in the rock matrix that until now have been otherwise impossible to visualise. This paper presents a suggested workflow explaining the steps required, using as an example a fossil tooth of Sphenacanthus hybodoides (Egerton), a shark from the Late Carboniferous of England. The original NHMUK copyrighted CT slice stack can be downloaded for practice of the described techniques, which include segmentation, rendering, movie animation, stereo-anaglyphy, data storage and dissemination. Fragile, rare specimens and type materials in university and museum collections can therefore be virtually processed for a variety of purposes, including virtual loans, website illustrations, publications and digital collections. Micro-CT and other 3D imaging techniques are increasingly used to facilitate data sharing among scientists and in education and outreach projects. Hence there is the potential to usher in a new era of global scientific collaboration and public communication using specimens in museum collections.
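A brief sketch of the threshold-segmentation and surface-extraction steps mentioned in the workflow, assuming the CT slices can be loaded into a 3D NumPy array with scikit-image; the file pattern, threshold choice and voxel spacing are placeholders, and this is not the specific software pipeline used by the authors.

```python
# Illustrative CT volume segmentation and surface extraction (scikit-image).
import numpy as np
from skimage import io, filters, measure

# Load a stack of CT slices into a (z, y, x) volume; the path is a placeholder.
slices = io.imread_collection("ct_slices/*.tif")
volume = np.stack([np.asarray(s) for s in slices]).astype(np.float32)

# Separate dense tooth material from the rock matrix with a global threshold;
# Otsu's method is used for illustration, a manually tuned value may work better.
threshold = filters.threshold_otsu(volume)
segmented = volume > threshold

# Extract a triangle mesh of the segmented surface (marching cubes),
# with a placeholder voxel spacing in millimetres.
verts, faces, normals, values = measure.marching_cubes(
    segmented.astype(np.uint8), level=0.5, spacing=(0.05, 0.02, 0.02))
print(f"{len(verts)} vertices, {len(faces)} faces")
```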
Abstract:
Background: Chronic diseases are the leading cause of premature death and disability in the world, with overnutrition a primary cause of diet-related ill health. Excess intakes of energy, saturated fat, sugar, and salt derived from processed foods are a major cause of disease burden. Our objective is to compare the nutritional composition of processed foods between countries, between food companies, and over time. Design: Surveys of processed foods will be done in each participating country using a standardized methodology. Information on the nutrient composition of each product will be sought either through direct chemical analysis, from the product label, or from the manufacturer. Foods will be categorized into 14 groups and 45 categories for the primary analyses, which will compare mean levels of nutrients at baseline and over time. Initial commitments to collaboration have been obtained from 21 countries. Conclusions: This collaborative approach to the collation and sharing of data will enable objective and transparent tracking of processed food composition around the world. The information collected will support government and food industry efforts to improve the nutrient composition of processed foods around the world.
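The comparison of mean nutrient levels between countries and survey years described above can be illustrated with a short sketch; the table layout, column names, categories and values are hypothetical.

```python
# Illustrative comparison of mean nutrient levels across countries and survey
# years for a hypothetical product table (pandas).
import pandas as pd

records = pd.DataFrame([
    # country, company, food category, survey year, sodium (mg per 100 g)
    ("AU", "CompanyA", "Breads", 2012, 450),
    ("AU", "CompanyA", "Breads", 2014, 420),
    ("BR", "CompanyB", "Breads", 2012, 520),
    ("BR", "CompanyB", "Breads", 2014, 500),
], columns=["country", "company", "category", "year", "sodium_mg_100g"])

# Mean sodium by country, category and survey year (baseline vs. follow-up).
mean_levels = (records
               .groupby(["country", "category", "year"])["sodium_mg_100g"]
               .mean()
               .unstack("year"))
mean_levels["change"] = mean_levels[2014] - mean_levels[2012]
print(mean_levels)
```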
Abstract:
In this paper, we propose nonlinear elliptical models for correlated data with heteroscedastic and/or autoregressive structures. Our aim is to extend the models proposed by Russo et al. [22] by considering a more sophisticated scale structure to deal with variations in data dispersion and/or possible autocorrelation among measurements taken on the same experimental unit. Moreover, to avoid the possible influence of outlying observations, or to take into account the non-normal symmetric tails of the data, we assume elliptical contours for the joint distribution of random effects and errors, which allows us to attribute different weights to the observations. We propose an iterative algorithm to obtain the maximum-likelihood estimates of the parameters and derive the local influence curvatures for some specific perturbation schemes. The motivation for this work comes from a pharmacokinetic indomethacin data set, which was previously analysed by Bocheng and Xuping [1] under the assumption of normality.
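As one concrete example of the kind of heteroscedastic and autoregressive scale structure mentioned above (a generic illustration, not necessarily the exact specification used by the authors or by Russo et al. [22]), a heteroscedastic first-order autoregressive within-unit error covariance can be written as:

```latex
% Illustrative heteroscedastic AR(1) covariance for errors within unit i,
% with possibly observation-specific scales \sigma_{ij}:
\operatorname{Cov}(\epsilon_{ij}, \epsilon_{ik})
  = \sigma_{ij}\,\sigma_{ik}\,\rho^{\lvert j-k \rvert},
\qquad 0 \le \rho < 1 .
```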
Abstract:
Abstract Background Facilitating the provision of appropriate health care for immigrant and Aboriginal populations in Canada is critical for maximizing health potential and well-being. Numerous reports describe heightened risks of poor maternal and birth outcomes for immigrant and Aboriginal women. Many of these outcomes may relate to food consumption/practices and thus may be obviated through provision of resources which suit the women's ethnocultural preferences. This project aims to understand ethnocultural food and health practices of Aboriginal and immigrant women, and how these intersect with respect to the legacy of Aboriginal colonialism and to the social contexts of cultural adaptation and adjustment of immigrants. The findings will inform the development of visual tools for health promotion by practitioners. Methods/Design This four-phase study employs a case study design allowing for multiple means of data collection and different units of analysis. Phase 1 consists of a scoping review of the literature. Phases 2 and 3 incorporate pictorial representations of food choices (photovoice in Phase 2) with semi-structured photo-elicited interviews (in Phase 3). The findings from Phases 1-3 and consultations with key stakeholders will generate key understandings for Phase 4, the production of culturally appropriate visual tools. For the scoping review, an emerging methodological framework will be utilized in addition to systematic review guidelines. A research librarian will assist with the search strategy and retrieval of literature. For Phases 2 and 3, recruitment of 20-24 women will be facilitated by team member affiliations at perinatal clinics in one of the city's most diverse neighbourhoods. The interviews will reveal culturally normative practices surrounding maternal food choices and consumption, including how women negotiate these practices within their own worldview and experiences. A structured and comprehensive integrated knowledge translation plan has been formulated. Discussion The findings of this study will provide practitioners with an understanding of the cultural differences that affect women's dietary choices during maternity. We expect that the developed resources will be of immediate use within the women's units and will enhance counseling efforts. Wide dissemination of outputs may have a greater long term impact in the primary and secondary prevention of these high risk conditions.
Abstract:
Abstract Background The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data are available, biologists are faced with the task of extracting (new) knowledge associated with the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. Conclusions The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data.
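A toy sketch of the connector idea described above: a software connector that maps fields from a tool-specific record onto terms of a reference ontology and applies a simple transformation rule during the exchange. All class names, ontology terms and rules here are invented for illustration and do not correspond to the authors' reference ontology.

```python
# Illustrative "software connector" mapping a tool-specific record onto
# reference-ontology terms with transformation rules (all names hypothetical).
from typing import Callable

# Field mapping: source field -> reference ontology term (invented terms).
FIELD_MAP = {
    "probe_id": "geo:ProbeIdentifier",
    "expr":     "geo:ExpressionLevel",
}

# Transformation rules applied on exchange, keyed by ontology term.
RULES: dict[str, Callable[[object], object]] = {
    "geo:ExpressionLevel": lambda v: round(float(v), 3),  # normalize precision
}

def connect(source_record: dict) -> dict:
    """Translate a source record into the shared, ontology-annotated form."""
    shared = {}
    for field, value in source_record.items():
        term = FIELD_MAP.get(field)
        if term is None:
            continue                      # field has no agreed meaning; drop it
        rule = RULES.get(term, lambda v: v)
        shared[term] = rule(value)
    return shared

print(connect({"probe_id": "AFFX-001", "expr": "12.34567", "lab_note": "x"}))
# {'geo:ProbeIdentifier': 'AFFX-001', 'geo:ExpressionLevel': 12.346}
```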
Abstract:
The miniaturization race in the hardware industry, aiming at a continuous increase of transistor density on a die, no longer brings corresponding application performance improvements. One of the most promising alternatives is to exploit the heterogeneous nature of common applications in hardware. Supported by reconfigurable computation, which has already proved its efficiency in accelerating data-intensive applications, this concept promises a breakthrough in contemporary technology development. Memory organization in such heterogeneous reconfigurable architectures becomes very critical. Two primary aspects introduce a sophisticated trade-off. On the one hand, a memory subsystem should provide a well-organized distributed data structure and guarantee the required data bandwidth. On the other hand, it should hide the heterogeneous hardware structure from the end user, in order to support feasible high-level programmability of the system. This thesis explores heterogeneous reconfigurable hardware architectures and presents possible solutions to cope with the problem of memory organization and data structure. Using the MORPHEUS heterogeneous platform as an example, the discussion follows the complete design cycle, from decision making and justification to hardware realization. Particular emphasis is placed on the methods to support high system performance, meet application requirements, and provide a user-friendly programmer interface. As a result, the research introduces a complete heterogeneous platform enhanced with a hierarchical memory organization, which copes with its task by separating computation from communication, providing the reconfigurable engines with computation and configuration data, and unifying heterogeneous computational devices through local storage buffers. It is distinguished from related solutions by its distributed data-flow organization, specifically engineered mechanisms to operate on data in local domains, a particular communication infrastructure based on a Network-on-Chip, and thorough methods to prevent computation and communication stalls. In addition, a novel advanced technique to accelerate memory access was developed and implemented.
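The idea of separating computation from communication through local storage buffers can be illustrated, in software terms, by a classic double-buffering (ping-pong) scheme; this is only a conceptual analogue of the hardware mechanism described in the thesis, with all names and data invented.

```python
# Conceptual ping-pong (double) buffering: while the "engine" computes on one
# local buffer, the "communication" side refills the other, so neither stalls.
import threading
import queue

def communication(tiles, free_buffers, filled_buffers):
    """Producer: stream data tiles into whichever local buffer is free."""
    for tile in tiles:
        buf = free_buffers.get()          # wait for a free local buffer
        buf[:] = tile                     # emulate a DMA transfer
        filled_buffers.put(buf)
    filled_buffers.put(None)              # end-of-stream marker

def computation(free_buffers, filled_buffers, results):
    """Consumer: process whichever buffer is ready, then hand it back."""
    while True:
        buf = filled_buffers.get()
        if buf is None:
            break
        results.append(sum(buf))          # stand-in for the real kernel
        free_buffers.put(buf)             # buffer becomes free for refilling

tiles = [[i, i + 1, i + 2] for i in range(0, 12, 3)]
free_buffers, filled_buffers = queue.Queue(), queue.Queue()
for _ in range(2):                        # two local buffers -> ping-pong
    free_buffers.put([0, 0, 0])
results = []
t = threading.Thread(target=communication, args=(tiles, free_buffers, filled_buffers))
t.start()
computation(free_buffers, filled_buffers, results)
t.join()
print(results)                            # [3, 12, 21, 30]
```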
Abstract:
The Gaia space mission is a major project for the European astronomical community. As challenging as it is, the processing and analysis of the huge data flow coming from Gaia is the subject of thorough study and preparatory work by the DPAC (Data Processing and Analysis Consortium), in charge of all aspects of the Gaia data reduction. This PhD thesis was carried out in the framework of the DPAC, within the team based in Bologna. The task of the Bologna team is to define the calibration model and to build a grid of spectro-photometric standard stars (SPSS) suitable for the absolute flux calibration of the Gaia G-band photometry and the BP/RP spectrophotometry. Such a flux calibration can be performed by repeatedly observing each SPSS during the lifetime of the Gaia mission and by comparing the observed Gaia spectra to the spectra obtained by our ground-based observations. Because of both the different observing sites involved and the huge number of frames expected (≈100,000), it is essential to maintain the maximum homogeneity in data quality, acquisition and treatment, and particular care has to be taken to test the capabilities of each telescope/instrument combination (through the “instrument familiarization plan”) and to devise methods to keep under control, and eventually correct for, the typical instrumental effects that can affect the high precision required for the Gaia SPSS grid (a few % with respect to Vega). I contributed to the ground-based survey of Gaia SPSS in many respects: the observations, the instrument familiarization plan, the data reduction and analysis activities (both photometry and spectroscopy), and the maintenance of the data archives. However, the field I was personally responsible for was photometry, and in particular relative photometry for the production of short-term light curves. In this context I defined and tested a semi-automated pipeline which allows for the pre-reduction of imaging SPSS data and the production of aperture photometry catalogues ready to be used for further analysis. A series of semi-automated quality control criteria are included in the pipeline at various levels, from pre-reduction to aperture photometry to light curve production and analysis.
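A minimal sketch of the aperture photometry step at the heart of such a pipeline, using plain NumPy on an already reduced image; the star position, aperture and sky-annulus radii, and the toy image are placeholders, and this is not the actual pipeline described in the thesis.

```python
# Illustrative circular-aperture photometry with a local sky annulus (NumPy only).
import numpy as np

def aperture_photometry(image, x0, y0, r_ap=5.0, r_in=8.0, r_out=12.0):
    """Sky-subtracted flux of a star at (x0, y0) in pixel coordinates."""
    y, x = np.indices(image.shape)
    r = np.hypot(x - x0, y - y0)
    aperture = r <= r_ap
    annulus = (r >= r_in) & (r <= r_out)
    sky_per_pixel = np.median(image[annulus])          # robust local sky level
    return image[aperture].sum() - sky_per_pixel * aperture.sum()

# Toy image: flat sky of 100 counts plus a fake Gaussian star at (50, 40).
rng = np.random.default_rng(1)
image = rng.normal(100.0, 1.0, (101, 101))
yy, xx = np.indices(image.shape)
image += 5000.0 * np.exp(-((xx - 50) ** 2 + (yy - 40) ** 2) / (2 * 2.0 ** 2))

print(round(aperture_photometry(image, x0=50, y0=40), 1))  # ~ total star counts
```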
Abstract:
A study of maar-diatreme volcanoes has been performed by inversion of gravity and magnetic data. The geophysical inverse problem has been solved by means of the damped nonlinear least-squares method. To ensure stability and convergence of the solution of the inverse problem, a mathematical tool, consisting of data weighting and model scaling, has been worked out. Theoretical gravity and magnetic modeling of maar-diatreme volcanoes has been conducted in order to obtain information that can be used for a simple, rough qualitative and/or quantitative interpretation. This information also serves as a priori information to design models for the inversion and/or to assist the interpretation of inversion results. The results of theoretical modeling have been used to roughly estimate the heights and the dip angles of the walls of eight Eifel maar-diatremes, each taken as a whole. Inverse modeling has been conducted for the Schönfeld Maar (magnetics) and the Hausten-Morswiesen Maar (gravity and magnetics). The geometrical parameters of these maars, as well as the density and magnetic properties of the rocks filling them, have been estimated. For a reliable interpretation of the inversion results, besides the knowledge from theoretical modeling, other tools such as field transformations and spectral analysis were used for complementary information. Geologic models, based on the synthesis of the respective interpretation results, are presented for the two maars mentioned above. The results give more insight into the genesis, physics and posteruptive development of maar-diatreme volcanoes. A classification of maar-diatreme volcanoes into three main types has been elaborated. Relatively high magnetic anomalies are indicative of scoria cones embedded within maar-diatremes, if they are not caused by a strong remanent component of the magnetization. Smaller (weaker) secondary gravity and magnetic anomalies on the background of the main anomaly of a maar-diatreme, especially in the boundary areas, are indicative of subsidence processes, which probably occurred in the late sedimentation phase of the posteruptive development. Contrary to postulates referring to kimberlite pipes, there is no generalized systematic relationship between diameter and height, nor between the geophysical anomaly and the dimensions of the maar-diatreme volcanoes. Although both maar-diatreme volcanoes and kimberlite pipes are products of phreatomagmatism, they probably formed in different thermodynamic and hydrogeological environments. In the case of kimberlite pipes, large amounts of magma and groundwater, certainly supplied by deep and large reservoirs, interacted under high pressure and temperature conditions. This led to a long-lasting phreatomagmatic process and hence to the formation of large structures. Concerning maar-diatreme and tuff-ring-diatreme volcanoes, the phreatomagmatic process takes place through an interaction between magma from small and shallow magma chambers (probably segregated magmas) and small amounts of near-surface groundwater under low pressure and temperature conditions. This leads to shorter eruptions and consequently to structures of smaller size in comparison with kimberlite pipes. Nevertheless, the results show that the diameter-to-height ratio for 50% of the studied maar-diatremes is around 1, while the dip angle of the diatreme walls is similar to that of kimberlite pipes and lies between 70 and 85°. Note that these numerical characteristics, especially the dip angle, hold for those maars whose diatremes, as estimated by modeling, have the shape of a truncated cone. This indicates that the diatreme cannot be completely resolved by inversion.
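A compact sketch of a damped (Levenberg-Marquardt-style) nonlinear least-squares iteration with data weighting, of the general kind named in the abstract; the forward model, weights, damping schedule and toy data below are placeholders, not the actual gravity/magnetic modeling code or the model-scaling scheme described.

```python
# Illustrative damped nonlinear least-squares inversion with data weighting.
import numpy as np

def damped_nlls(forward, jacobian, d_obs, m0, weights, lam=1.0, n_iter=20):
    """Iteratively update model m to fit observed data d_obs.

    forward(m)  -> predicted data vector
    jacobian(m) -> sensitivity matrix J (n_data x n_model)
    weights     -> per-datum weights (e.g. inverse data errors)
    lam         -> damping factor controlling step size / stability
    """
    m = np.asarray(m0, dtype=float)
    W = np.diag(weights)
    for _ in range(n_iter):
        r = d_obs - forward(m)                       # data residuals
        J = jacobian(m)
        # Damped normal equations: (J^T W J + lam*I) dm = J^T W r
        A = J.T @ W @ J + lam * np.eye(m.size)
        dm = np.linalg.solve(A, J.T @ W @ r)
        m = m + dm
    return m

# Toy example: recover amplitude and decay of y = a * exp(-b * x).
x = np.linspace(0, 4, 30)
true = np.array([2.0, 0.7])
d_obs = true[0] * np.exp(-true[1] * x)

forward = lambda m: m[0] * np.exp(-m[1] * x)
jacobian = lambda m: np.column_stack([np.exp(-m[1] * x),
                                      -m[0] * x * np.exp(-m[1] * x)])
print(damped_nlls(forward, jacobian, d_obs, m0=[1.0, 1.0],
                  weights=np.ones_like(x)))          # ~ [2.0, 0.7]
```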