46 results for Unicode Common Locale Data Repository
in CentAUR: Central Archive at the University of Reading - UK
Abstract:
OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Abstract:
We describe ncWMS, an implementation of the Open Geospatial Consortium’s Web Map Service (WMS) specification for multidimensional gridded environmental data. ncWMS can read data in a large number of common scientific data formats – notably the NetCDF format with the Climate and Forecast conventions – then efficiently generate map imagery in thousands of different coordinate reference systems. It is designed to require minimal configuration from the system administrator and, when used in conjunction with a suitable client tool, provides end users with an interactive means for visualizing data without the need to download large files or interpret complex metadata. It is also used as a “bridging” tool providing interoperability between the environmental science community and users of geographic information systems. ncWMS implements a number of extensions to the WMS standard in order to fulfil some common scientific requirements, including the ability to generate plots representing timeseries and vertical sections. We discuss these extensions and their impact upon present and future interoperability. We discuss the conceptual mapping between the WMS data model and the data models used by gridded data formats, highlighting areas in which the mapping is incomplete or ambiguous. We discuss the architecture of the system and particular technical innovations of note, including the algorithms used for fast data reading and image generation. ncWMS has been widely adopted within the environmental data community and we discuss some of the ways in which the software is integrated within data infrastructures and portals.
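For readers unfamiliar with the WMS interface that ncWMS implements, the sketch below assembles a standard WMS 1.3.0 GetMap request for a gridded layer, including the optional TIME and ELEVATION dimensions that multidimensional environmental data need. The server URL and layer name are hypothetical placeholders, not values from the abstract; only the standard GetMap parameters are assumed.

```python
from urllib.parse import urlencode

# Hypothetical ncWMS-style endpoint, used only for illustration.
BASE_URL = "https://example.org/ncWMS/wms"

def build_getmap_url(layer, bbox, time=None, elevation=None,
                     crs="EPSG:4326", width=512, height=512):
    """Assemble a standard WMS 1.3.0 GetMap request for a gridded layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                      # server default style
        "CRS": crs,                        # target coordinate reference system
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    # Optional dimensions for time- and depth-varying data.
    if time is not None:
        params["TIME"] = time              # ISO 8601 timestamp
    if elevation is not None:
        params["ELEVATION"] = elevation    # vertical level in the dataset's units
    return BASE_URL + "?" + urlencode(params)

# Example: a global map of sea surface temperature for one time step.
print(build_getmap_url("sea_water_temperature",
                       bbox=(-90, -180, 90, 180),   # minlat,minlon,maxlat,maxlon for EPSG:4326 in WMS 1.3.0
                       time="2010-01-01T00:00:00Z",
                       elevation=0))
```

A map client issues many such requests as the user pans, zooms, or steps through time; the extensions discussed in the paper add analogous requests for timeseries and vertical-section plots.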
Abstract:
In nonhuman species, testosterone is known to have permanent organizing effects early in life that predict later expression of sex differences in brain and behavior. However, in humans, it is still unknown whether such mechanisms have organizing effects on neural sexual dimorphism. In human males, we show that variation in fetal testosterone (FT) predicts later local gray matter volume of specific brain regions in a direction that is congruent with sexual dimorphism observed in a large independent sample of age-matched males and females from the NIH Pediatric MRI Data Repository. Right temporoparietal junction/posterior superior temporal sulcus (RTPJ/pSTS), planum temporale/parietal operculum (PT/PO), and posterior lateral orbitofrontal cortex (plOFC) had local gray matter volume that was both sexually dimorphic and predicted in a congruent direction by FT. That is, gray matter volume in RTPJ/pSTS was greater for males compared to females and was positively predicted by FT. Conversely, gray matter volume in PT/PO and plOFC was greater in females compared to males and was negatively predicted by FT. Subregions of both amygdala and hypothalamus were also sexually dimorphic in the direction of Male > Female, but were not predicted by FT. However, FT positively predicted gray matter volume of a non-sexually dimorphic subregion of the amygdala. These results bridge a long-standing gap between human and nonhuman species by showing that FT acts as an organizing mechanism for the development of regional sexual dimorphism in the human brain.
Abstract:
Plant traits – the morphological, anatomical, physiological, biochemical and phenological characteristics of plants and their organs – determine how primary producers respond to environmental factors, affect other trophic levels, influence ecosystem processes and services and provide a link from species richness to ecosystem functional diversity. Trait data thus represent the raw material for a wide range of research from evolutionary biology, community and functional ecology to biogeography. Here we present the global database initiative named TRY, which has united a wide range of the plant trait research community worldwide and gained an unprecedented buy-in of trait data: so far 93 trait databases have been contributed. The data repository currently contains almost three million trait entries for 69 000 out of the world's 300 000 plant species, with a focus on 52 groups of traits characterizing the vegetative and regeneration stages of the plant life cycle, including growth, dispersal, establishment and persistence. A first data analysis shows that most plant traits are approximately log-normally distributed, with widely differing ranges of variation across traits. Most trait variation is between species (interspecific), but significant intraspecific variation is also documented, up to 40% of the overall variation. Plant functional types (PFTs), as commonly used in vegetation models, capture a substantial fraction of the observed variation – but for several traits most variation occurs within PFTs, up to 75% of the overall variation. In the context of vegetation models these traits would better be represented by state variables rather than fixed parameter values. The improved availability of plant trait data in the unified global database is expected to support a paradigm shift from species to trait-based ecology, offer new opportunities for synthetic plant trait research and enable a more realistic and empirically grounded representation of terrestrial vegetation in Earth system models.
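To make the between- versus within-species partitioning concrete, the sketch below splits the variance of a log-transformed trait into interspecific and intraspecific components using a simple sums-of-squares decomposition. The species names and trait values are made up for illustration; this is not the analysis pipeline used by TRY.

```python
import math
from collections import defaultdict

# Hypothetical trait observations (species, trait value), e.g. specific leaf area.
observations = [
    ("Quercus robur", 12.1), ("Quercus robur", 14.3), ("Quercus robur", 13.0),
    ("Fagus sylvatica", 18.5), ("Fagus sylvatica", 21.0), ("Fagus sylvatica", 19.2),
    ("Pinus sylvestris", 4.8), ("Pinus sylvestris", 5.5), ("Pinus sylvestris", 5.1),
]

# Work on the log scale, since most traits are roughly log-normally distributed.
log_values = [(sp, math.log(v)) for sp, v in observations]
grand_mean = sum(v for _, v in log_values) / len(log_values)

by_species = defaultdict(list)
for sp, v in log_values:
    by_species[sp].append(v)

# Sums of squares: total = between-species + within-species.
ss_between = sum(len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
                 for vals in by_species.values())
ss_within = sum((v - sum(vals) / len(vals)) ** 2
                for vals in by_species.values() for v in vals)
ss_total = ss_between + ss_within

print(f"interspecific share: {ss_between / ss_total:.1%}")
print(f"intraspecific share: {ss_within / ss_total:.1%}")
```

The same decomposition with plant functional types as the grouping variable indicates how much trait variation a PFT-based model parameterisation can and cannot capture.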
Abstract:
There has been a clear lack of common data exchange semantics for inter-organisational workflow management systems, where research has mainly focused on technical issues rather than language constructs. This paper presents the neutral data exchange semantics required for workflow integration within the AXAEDIS framework and presents the mechanism for object discovery from the object repository where little or no knowledge about the object is available. The paper also presents a workflow-independent integration architecture with the AXAEDIS Framework.
Abstract:
The common GIS-based approach to regional analyses of soil organic carbon (SOC) stocks and changes is to define geographic layers for which unique sets of driving variables are derived, which include land use, climate, and soils. These GIS layers, with their associated attribute data, can then be fed into a range of empirical and dynamic models. Common methodologies for collating and formatting regional data sets on land use, climate, and soils were adopted for the project Assessment of Soil Organic Carbon Stocks and Changes at National Scale (GEFSOC). This permitted the development of a uniform protocol for handling the various inputs for the dynamic GEFSOC Modelling System. Consistent soil data sets for Amazon-Brazil, the Indo-Gangetic Plains (IGP) of India, Jordan and Kenya, the case study areas considered in the GEFSOC project, were prepared using methodologies developed for the World Soils and Terrain Database (SOTER). The approach involved three main stages: (1) compiling new soil geographic and attribute data in SOTER format; (2) using expert estimates and common sense to fill selected gaps in the measured or primary data; (3) using a scheme of taxonomy-based pedotransfer rules and expert rules to derive soil parameter estimates for similar soil units with missing soil analytical data. The most appropriate approach varied from country to country, depending largely on the overall accessibility and quality of the primary soil data available in the case study areas. The secondary SOTER data sets discussed here are appropriate for a wide range of environmental applications at national scale. These include agro-ecological zoning, land evaluation, modelling of soil C stocks and changes, and studies of soil vulnerability to pollution. Estimates of national-scale stocks of SOC, calculated using SOTER methods, are presented as a first example of database application. Independent estimates of SOC stocks are needed to evaluate the outcome of the GEFSOC Modelling System for current conditions of land use and climate. (C) 2007 Elsevier B.V. All rights reserved.
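As a concrete illustration of the kind of database application mentioned above, the sketch below applies the widely used layer-wise formula for soil organic carbon stocks (carbon concentration × bulk density × layer thickness, corrected for coarse fragments) to a hypothetical SOTER-style profile. The horizon values and attribute names are invented for the example and are not taken from the GEFSOC data sets.

```python
# Layer-wise SOC stock:
#   stock (t C/ha) = C% * bulk density (g/cm^3) * thickness (cm) * (1 - gravel fraction)
# with C% expressed as g C per 100 g fine earth; the unit factors cancel to t C per hectare.

# Hypothetical SOTER-style profile: one dict per horizon.
profile = [
    {"top_cm": 0,  "bottom_cm": 20,  "c_percent": 1.8, "bulk_density": 1.25, "gravel_frac": 0.05},
    {"top_cm": 20, "bottom_cm": 50,  "c_percent": 0.9, "bulk_density": 1.40, "gravel_frac": 0.10},
    {"top_cm": 50, "bottom_cm": 100, "c_percent": 0.4, "bulk_density": 1.50, "gravel_frac": 0.15},
]

def soc_stock_t_per_ha(profile, max_depth_cm=100):
    """Sum layer stocks down to max_depth_cm, truncating the last horizon if needed."""
    total = 0.0
    for layer in profile:
        top = layer["top_cm"]
        bottom = min(layer["bottom_cm"], max_depth_cm)
        if bottom <= top:
            continue
        thickness = bottom - top
        total += (layer["c_percent"] * layer["bulk_density"]
                  * thickness * (1.0 - layer["gravel_frac"]))
    return total

print(f"SOC stock to 1 m: {soc_stock_t_per_ha(profile):.1f} t C/ha")
```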
Abstract:
The wild common bean (Phaseolus vulgaris) is widely but discontinuously distributed from northern Mexico to northern Argentina on both sides of the Isthmus of Panama. Little is known about how the species has reached its current disjunct distribution. In this research, chloroplast DNA polymorphisms in seven non-coding regions were used to study the history of migration of wild P. vulgaris between Mesoamerica and South America. A penalized likelihood analysis was applied to previously published Leguminosae ITS data to estimate divergence times between P. vulgaris and its sister taxa from Mesoamerica, and divergence times of populations within P. vulgaris. Fourteen chloroplast haplotypes were identified by PCR-RFLP and their geographical associations were studied by means of a Nested Clade Analysis and Mantel Tests. The results suggest that the haplotypes are not randomly distributed but occupy discrete parts of the geographic range of the species. The current distribution of haplotypes may be explained by isolation by distance and by at least two migration events between Mesoamerica and South America: one from Mesoamerica to South America and another one from northern South America to Mesoamerica. Age estimates place the divergence of P. vulgaris from its sister taxa from Mesoamerica at or before 1.3 Ma, and divergence of populations from Ecuador-northern Peru at or before 0.6 Ma. As these ages are taken as minimum divergence times, the influence of past events, such as the closure of the Isthmus of Panama and the final uplift of the Andes, on the migration history and population structure of this species cannot be disregarded.
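For readers unfamiliar with the Mantel test used above, the sketch below shows the basic permutation procedure: correlate the off-diagonal entries of a genetic and a geographic distance matrix, then re-estimate that correlation many times with the rows and columns of one matrix jointly permuted to obtain a p-value. The matrices here are random placeholders, not the haplotype data from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def mantel_test(dist_a, dist_b, n_permutations=999):
    """One-sided permutation Mantel test between two symmetric distance matrices."""
    n = dist_a.shape[0]
    iu = np.triu_indices(n, k=1)               # off-diagonal (upper-triangle) entries
    observed = np.corrcoef(dist_a[iu], dist_b[iu])[0, 1]

    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        permuted = dist_b[np.ix_(perm, perm)]  # permute rows and columns together
        r = np.corrcoef(dist_a[iu], permuted[iu])[0, 1]
        if r >= observed:
            count += 1
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value

# Placeholder matrices standing in for genetic and geographic distances among populations.
coords = rng.random((12, 2))
geo = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
gen = geo + rng.normal(scale=0.1, size=geo.shape)
gen = (gen + gen.T) / 2                        # keep the matrix symmetric
np.fill_diagonal(gen, 0)

r, p = mantel_test(gen, geo)
print(f"Mantel r = {r:.2f}, p = {p:.3f}")
```

A significant positive correlation of this kind is what an isolation-by-distance pattern would produce.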
Abstract:
This article demonstrates that the design and nature of agricultural support schemes have an influence on farmers' perception of their level of dependence on agricultural support. While direct aid payments inform farmers about the extent to which they are subsidised, indirect support mechanisms veil the level of subsidisation, and therefore farmers are not fully aware of the extent to which they are supported. To test this hypothesis, we applied data from a survey of 4,500 farmers in three countries: the United Kingdom, Germany and Portugal. It is demonstrated that indirect support, such as that provided through artificially high consumer prices, gives an illusion of free and competitive markets among farmers. This 'visibility' hypothesis is evaluated against an alternative hypothesis that assumes farmers have complete, or at least a fairly comprehensive level of, information on agricultural support schemes. Our findings show that this alternative hypothesis can be ruled out.
Abstract:
Background: Meta-analyses based on individual patient data (IPD) are regarded as the gold standard for systematic reviews. However, the methods used for analysing and presenting results from IPD meta-analyses have received little discussion. Methods: We review 44 IPD meta-analyses published during the years 1999–2001. We summarize whether they obtained all the data they sought, what types of approaches were used in the analysis, including assumptions of common or random effects, and how they examined the effects of covariates. Results: Twenty-four out of 44 analyses focused on time-to-event outcomes, and most analyses (28) estimated treatment effects within each trial and then combined the results assuming a common treatment effect across trials. Three analyses failed to stratify by trial, analysing the data as if they came from a single mega-trial. Only nine analyses used random effects methods. Covariate-treatment interactions were generally investigated by subgrouping patients. Seven of the meta-analyses included data from less than 80% of the randomized patients sought, but did not address the resulting potential biases. Conclusions: Although IPD meta-analyses have many advantages in assessing the effects of health care, there are several aspects that could be further developed to make fuller use of the potential of these time-consuming projects. In particular, IPD could be used to more fully investigate the influence of covariates on heterogeneity of treatment effects, both within and between trials. The impact of heterogeneity, and the use of random effects, are seldom discussed. There is thus considerable scope for enhancing the methods of analysis and presentation of IPD meta-analysis.
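To illustrate the two-stage approach that most of the reviewed analyses used, the sketch below combines per-trial treatment-effect estimates by inverse-variance weighting, once under a common-effect assumption and once with a DerSimonian-Laird random-effects adjustment. The trial estimates are invented numbers for illustration only.

```python
import math

# Hypothetical per-trial estimates: (log hazard ratio, standard error).
trials = [(-0.25, 0.12), (-0.10, 0.18), (-0.35, 0.15), (-0.05, 0.20)]

def common_effect(estimates):
    """Inverse-variance pooling assuming one common treatment effect across trials."""
    weights = [1 / se**2 for _, se in estimates]
    pooled = sum(w * est for (est, _), w in zip(estimates, weights)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))

def random_effects(estimates):
    """DerSimonian-Laird: add the between-trial variance tau^2 to each trial's variance."""
    pooled_fixed, _ = common_effect(estimates)
    weights = [1 / se**2 for _, se in estimates]
    q = sum(w * (est - pooled_fixed) ** 2 for (est, _), w in zip(estimates, weights))
    c = sum(weights) - sum(w**2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c)
    re_weights = [1 / (se**2 + tau2) for _, se in estimates]
    pooled = sum(w * est for (est, _), w in zip(estimates, re_weights)) / sum(re_weights)
    return pooled, math.sqrt(1 / sum(re_weights)), tau2

print("common effect:", common_effect(trials))
print("random effects:", random_effects(trials))
```

Stratifying by trial in this way is what the three "mega-trial" analyses failed to do; with IPD, the per-trial step can also include covariate adjustment before pooling.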
Abstract:
This paper presents a simple Bayesian approach to sample size determination in clinical trials. It is required that the trial should be large enough to ensure that the data collected will provide convincing evidence either that an experimental treatment is better than a control or that it fails to improve upon control by some clinically relevant difference. The method resembles standard frequentist formulations of the problem, and indeed in certain circumstances involving 'non-informative' prior information it leads to identical answers. In particular, unlike many Bayesian approaches to sample size determination, use is made of an alternative hypothesis that an experimental treatment is better than a control treatment by some specified magnitude. The approach is introduced in the context of testing whether a single stream of binary observations is consistent with a given success rate p0. Next the case of comparing two independent streams of normally distributed responses is considered, first under the assumption that their common variance is known and then for unknown variance. Finally, the more general situation in which a large sample is to be collected and analysed according to the asymptotic properties of the score statistic is explored. Copyright (C) 2007 John Wiley & Sons, Ltd.
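As a point of reference for the frequentist formulation that the Bayesian method is said to resemble, the sketch below computes the classical per-group sample size for comparing two normally distributed response streams with known common variance; under 'non-informative' priors the abstract indicates the Bayesian answer coincides with this. The effect size, standard deviation, and error rates are arbitrary illustrative choices.

```python
from statistics import NormalDist
import math

def n_per_group(delta, sigma, alpha=0.05, power=0.9):
    """Per-group n for a two-sided two-sample z-test:
    n = 2 * sigma^2 * (z_{1-alpha/2} + z_{1-beta})^2 / delta^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)

# Detect a clinically relevant difference of 5 units when the common SD is 12.
print(n_per_group(delta=5, sigma=12))   # per-group sample size
```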
Abstract:
Standardisation of microsatellite allele profiles between laboratories is of fundamental importance to the transferability of genetic fingerprint data and the identification of clonal individuals held at multiple sites. Here we describe two methods of standardisation applied to the microsatellite fingerprinting of 429 Theobroma cacao L. trees representing 345 accessions held in the world's largest Cocoa Intermediate Quarantine facility: the use of a partial allelic ladder through the production of 46 cloned and sequenced allelic standards (AJ748464 to AJ748509), and the use of standard genotypes selected to display a diverse allelic range. Until now a lack of accurate and transferable identification information has impeded efforts to genetically improve the cocoa crop. To address this need, a global initiative to fingerprint all international cocoa germplasm collections using a common set of 15 microsatellite markers is in progress. Data reported here have been deposited with the International Cocoa Germplasm Database and form the basis of a searchable resource for clonal identification. To our knowledge, this is the first quarantine facility to be completely genotyped using microsatellite markers for the purpose of quality control and clonal identification. Implications of the results for retrospective tracking of labelling errors are briefly explored.
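A minimal sketch of how a standardised fingerprint database supports clonal identification: each accession is stored as its allele pairs at a shared marker panel, and a query genotype is matched against the database by counting marker mismatches. The marker names, allele sizes, and accession profiles below are invented placeholders; the real resource uses a common set of 15 microsatellite markers.

```python
# Each genotype: marker name -> unordered pair of allele sizes (base pairs).
database = {
    "ACCESSION A": {"marker01": (132, 140), "marker02": (230, 236), "marker03": (291, 301)},
    "ACCESSION B": {"marker01": (128, 132), "marker02": (224, 230), "marker03": (295, 295)},
    "ACCESSION C": {"marker01": (132, 140), "marker02": (230, 236), "marker03": (291, 301)},
}

def mismatches(geno_a, geno_b):
    """Number of shared markers at which the two allele pairs differ."""
    shared = set(geno_a) & set(geno_b)
    return sum(sorted(geno_a[m]) != sorted(geno_b[m]) for m in shared)

def find_clonal_matches(query, database, max_mismatches=0):
    """Return accessions whose profile matches the query at every shared marker."""
    return [name for name, geno in database.items()
            if mismatches(query, geno) <= max_mismatches]

query = {"marker01": (140, 132), "marker02": (236, 230), "marker03": (301, 291)}
print(find_clonal_matches(query, database))   # identical profiles -> possible duplicates or mislabelling
```

Standardised allele calls (via an allelic ladder or reference genotypes) are what make such cross-laboratory matching meaningful.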
Abstract:
A series of the most common chelators used in magnetic resonance imaging (MRI) and in radiopharmaceuticals for medical diagnosis and tumour therapy, H4dota, H4teta, H8dotp and H8tetp, is examined from a chemical point of view. Differences between 12- and 14-membered tetraazamacrocyclic derivatives with methylcarboxylate and methylphosphonate pendant arms and their chelates with divalent first-series transition metal and trivalent lanthanide ions are discussed on the basis of their thermodynamic stability constants, X-ray structures and theoretical studies.
Abstract:
Knowledge-elicitation is a common technique used to produce rules about the operation of a plant from the knowledge that is available from human expertise. Similarly, data-mining is becoming a popular technique to extract rules from the data available from the operation of a plant. In the work reported here knowledge was required to enable the supervisory control of an aluminium hot strip mill by the determination of mill set-points. A method was developed to fuse knowledge-elicitation and data-mining to incorporate the best aspects of each technique, whilst avoiding known problems. Utilisation of the knowledge was through an expert system, which determined schedules of set-points and provided information to human operators. The results show that the method proposed in this paper was effective in producing rules for the on-line control of a complex industrial process. (C) 2005 Elsevier Ltd. All rights reserved.
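A minimal sketch of the general idea of fusing elicited and mined rules into a single rule base that recommends mill set-points; the rule contents, variable names, and precedence policy are invented for illustration and are not taken from the paper.

```python
# Each rule: (source, condition on the mill state, recommended set-point update).
rules = [
    ("elicited", lambda s: s["strip_gauge_mm"] < 3.0,
     {"roll_force_kN": 16000}),
    ("mined", lambda s: s["entry_temp_C"] > 1050 and s["strip_width_mm"] > 1500,
     {"roll_speed_mps": 9.5}),
    ("mined", lambda s: s["strip_gauge_mm"] < 3.0,
     {"roll_force_kN": 15500}),
]

def recommend_set_points(state, rules, precedence=("elicited", "mined")):
    """Apply all matching rules; when two sources set the same parameter,
    the source listed first in `precedence` wins (one possible fusion policy)."""
    set_points = {}
    chosen_source = {}
    for source, condition, update in rules:
        if not condition(state):
            continue
        for param, value in update.items():
            if (param not in set_points or
                    precedence.index(source) < precedence.index(chosen_source[param])):
                set_points[param] = value
                chosen_source[param] = source
    return set_points

state = {"strip_gauge_mm": 2.5, "entry_temp_C": 1080, "strip_width_mm": 1600}
print(recommend_set_points(state, rules))
```

In practice the fusion policy, conflict resolution, and presentation of recommendations to operators are where most of the engineering effort lies.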