29 results for data complexity
Abstract:
Developing models to predict the effects of social and economic change on agricultural landscapes is an important challenge. Model development often involves making decisions about which aspects of the system require detailed description and which are reasonably insensitive to the assumptions. However, important components of the system are often left out because parameter estimates are unavailable. In particular, the relative influence of different objectives, such as risk and environmental management, on farmer decision making has proven difficult to quantify. We describe a model that can make predictions of land use on the basis of profit alone or with the inclusion of explicit additional objectives. Importantly, our model is specifically designed to use parameter estimates for additional objectives obtained via farmer interviews. By statistically comparing the outputs of this model with a large farm-level land-use data set, we show that cropping patterns in the United Kingdom contain a significant contribution from farmers' preferences for objectives other than profit. In particular, we found that risk aversion had an effect on the accuracy of model predictions, whereas preference for a particular number of crops grown was less important. While nonprofit objectives have frequently been identified as factors in farmers' decision making, our results take this analysis further by demonstrating the relationship between these preferences and actual cropping patterns.
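The abstract does not state the model's functional form. As a rough illustration of how profit and non-profit objectives might be combined into a single score, here is a hypothetical weighted-utility sketch; all weights, functional forms and numbers are invented for illustration and are not taken from the paper:

```python
# Hypothetical sketch: score candidate land-use plans by expected profit,
# penalised by risk aversion (year-to-year profit variability) and by
# deviation from a preferred number of crops. Illustrative only.
from statistics import mean, pstdev

def utility(profits_by_year, n_crops, w_risk=0.5, w_crops=0.1, preferred_crops=4):
    expected_profit = mean(profits_by_year)
    risk = pstdev(profits_by_year)                 # variability as a risk proxy
    crop_penalty = abs(n_crops - preferred_crops)  # preference for crop count
    return expected_profit - w_risk * risk - w_crops * crop_penalty

plans = {
    "profit_only": ([120, 40, 140, 30], 2),  # higher mean profit, volatile
    "diversified": ([90, 80, 95, 85], 4),    # lower mean profit, stable
}
best = max(plans, key=lambda p: utility(*plans[p]))
```

With non-zero risk aversion the stable, diversified plan scores higher despite its lower mean profit, which is the kind of effect the paper tests against observed cropping patterns.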
Abstract:
The bewildering complexity of cortical microcircuits at the single-cell level gives rise to surprisingly robust emergent activity patterns at the level of laminar and columnar local field potentials (LFPs) in response to targeted local stimuli. Here we report the results of our multivariate data-analytic approach based on simultaneous multi-site recordings using micro-electrode-array chips for investigation of the microcircuitry of rat somatosensory (barrel) cortex. We find high repeatability of stimulus-induced responses, and typical spatial distributions of LFP responses to stimuli in supragranular, granular, and infragranular layers, where the last form a particularly distinct class. Population spikes appear to travel at about 33 cm/s from granular to infragranular layers. Responses within barrel-related columns have different profiles than those in neighbouring columns to the left or right. Variations between slices occur, but can be minimized by strictly obeying controlled experimental protocols. Cluster analysis on normalized recordings indicates specific spatial distributions of time series reflecting the location of sources and sinks independent of the stimulus layer. Although the precise correspondences between single-cell activity and LFPs are still far from clear, a sophisticated neuroinformatics approach in combination with multi-site LFP recordings in the standardized slice preparation is suitable for comparing normal conditions to genetically or pharmacologically altered situations based on real cortical microcircuitry.
First order k-th moment finite element analysis of nonlinear operator equations with stochastic data
Abstract:
We develop and analyze a class of efficient Galerkin approximation methods for uncertainty quantification of nonlinear operator equations. The algorithms are based on sparse Galerkin discretizations of tensorized linearizations at nominal parameters. Specifically, we consider abstract, nonlinear, parametric operator equations J(α, u) = 0 for random input α(ω) with almost sure realizations in a neighborhood of a nominal input parameter α₀. Under some structural assumptions on the parameter dependence, we prove existence and uniqueness of a random solution u(ω) = S(α(ω)). We derive a multilinear, tensorized operator equation for the deterministic computation of k-th order statistical moments of the random solution's fluctuations u(ω) − S(α₀). We introduce and analyze sparse tensor Galerkin discretization schemes for the efficient, deterministic computation of the k-th statistical moment equation. We prove a shift theorem for the k-point correlation equation in anisotropic smoothness scales and deduce that sparse tensor Galerkin discretizations of this equation converge in accuracy vs. complexity which equals, up to logarithmic terms, that of the Galerkin discretization of a single instance of the mean field problem. We illustrate the abstract theory for nonstationary diffusion problems in random domains.
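The construction can be sketched schematically. Assuming a first-order linearization at the nominal parameter, the solution fluctuation satisfies a linearized equation, and tensorizing that equation gives a deterministic problem for its k-th moment (notation simplified; this is a sketch of the general idea, not the paper's precise statement):

```latex
% Linearize J(\alpha, u) = 0 at the nominal pair (\alpha_0, u_0 = S(\alpha_0)):
D_u J(\alpha_0, u_0)\,\delta u(\omega)
  = -\,D_\alpha J(\alpha_0, u_0)\,\bigl(\alpha(\omega) - \alpha_0\bigr),
\qquad \delta u(\omega) \approx u(\omega) - S(\alpha_0).

% Tensorizing and taking expectations yields a deterministic equation for the
% k-th moment \mathcal{M}^k \delta u := \mathbb{E}\bigl[(\delta u)^{\otimes k}\bigr]:
\bigl(D_u J(\alpha_0, u_0)\bigr)^{\otimes k}\,\mathcal{M}^k \delta u
  = \bigl(-\,D_\alpha J(\alpha_0, u_0)\bigr)^{\otimes k}\,
    \mathbb{E}\bigl[(\alpha - \alpha_0)^{\otimes k}\bigr].
```

The sparse tensor Galerkin schemes in the paper discretize the tensorized equation on the right without forming the full tensor product space, which is where the near-single-instance complexity comes from.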
Abstract:
Background: Expression microarrays are increasingly used to obtain large-scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log(2) units (6% of mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables which define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and inter-array variability, and a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
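The R pipeline itself is not reproduced in the abstract, but the two variability estimates it reports (within-array vs. between-array SD of replicate log2 signals) can be illustrated with a short Python sketch on invented numbers:

```python
# Illustration of the two variability estimates described in the abstract:
# SD of replicate spots within one array (intra-array) and SD of per-array
# mean signals across replicate arrays (inter-array). The log2 signal
# values below are invented; the paper's actual pipeline is an R function.
import math

def sd(values):
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

# log2 signals for one gene: duplicate spots on each of three replicate arrays
arrays = [[10.02, 10.05], [10.61, 10.58], [10.30, 10.33]]

intra_sds = [sd(spots) for spots in arrays]       # spread within each array
inter_sd = sd([sum(s) / len(s) for s in arrays])  # spread of per-array means
```

Consistent with the abstract's finding, the within-array replicate spread in a setup like this is typically an order of magnitude smaller than the spread between replicate arrays.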
Abstract:
An extensive off-line evaluation of the Noah/Single Layer Urban Canopy Model (Noah/SLUCM) urban land-surface model is presented using data from 15 sites to assess (1) the ability of the scheme to reproduce the surface energy balance observed in a range of urban environments, including seasonal changes, and (2) the impact of increasing complexity of input parameter information. Model performance is found to be most dependent on representation of vegetated surface area cover; refinement of other parameter values leads to smaller improvements. Model biases in net all-wave radiation and trade-offs between turbulent heat fluxes are highlighted using an optimization algorithm. Here we use the Urban Zones to characterize Energy partitioning (UZE) as the basis to assign default SLUCM parameter values. A methodology (FRAISE) to assign sites (or areas) to one of these categories based on surface characteristics is evaluated. Using three urban sites from the Basel Urban Boundary Layer Experiment (BUBBLE) dataset, an independent evaluation of the model performance with the parameter values representative of each class is performed. The scheme copes well with both seasonal changes in the surface characteristics and intra-urban heterogeneities in energy flux partitioning, with RMSE performance comparable to similar state-of-the-art models for all fluxes, sites and seasons. The potential of the methodology for high-resolution atmospheric modelling application using the Weather Research and Forecasting (WRF) model is highlighted. This analysis supports the recommendations that (1) three classes are appropriate to characterize the urban environment, and (2) that the parameter values identified should be adopted as default values in WRF.
Abstract:
Greater self-complexity has been suggested as a protective factor for people under stress (Linville, 1985). Two different measures have been proposed to assess individual self-complexity: Attneave's H statistic (1959) and a composite index of two components of self-complexity (SC; Rafaeli-Mor et al., 1999). Using mood-incongruent recall, i.e., recalling positive events while in a negative mood, the present study compared the validity of the two measures through reanalysis of Sakaki's (2004) data. Results indicated that the H statistic did not predict performance in mood-incongruent recall. In contrast, greater SC was associated with better mood-incongruent recall even when the effect of the H statistic was controlled for.
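Attneave's H is an information-theoretic index: for a trait-sorting task, H = log2(n) − (Σᵢ nᵢ·log2 nᵢ)/n, where n is the total number of trait adjectives and nᵢ is the number of traits in each distinct combination of self-aspect groups. A minimal sketch of the computation; the trait counts below are illustrative, not Sakaki's data:

```python
# Attneave's H statistic as used in self-complexity research:
# H = log2(n) - (sum_i n_i * log2(n_i)) / n, where n is the total number
# of trait adjectives and n_i the number of traits in each distinct
# combination of self-aspect groups. Higher H = greater self-complexity.
import math

def attneave_h(group_sizes):
    n = sum(group_sizes)
    return math.log2(n) - sum(ni * math.log2(ni) for ni in group_sizes) / n

low = attneave_h([8])             # all traits in one combination: H = 0
high = attneave_h([2, 2, 2, 2])   # traits spread evenly over four: H = 2
```

H is maximal when traits are spread evenly across many distinct combinations and zero when every trait falls into the same combination, which is why it is read as a redundancy-corrected count of independent self-aspects.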
Abstract:
A framework for understanding the complexity of cancer development was established by Hanahan and Weinberg in their definition of the hallmarks of cancer. In this review, we consider the evidence that parabens can enable development in human breast epithelial cells of 4/6 of the basic hallmarks, 1/2 of the emerging hallmarks and 1/2 of the enabling characteristics. Hallmark 1: parabens have been measured as present in 99% of human breast tissue samples, possess oestrogenic activity and can stimulate sustained proliferation of human breast cancer cells at concentrations measurable in the breast. Hallmark 2: parabens can inhibit the suppression of breast cancer cell growth by hydroxytamoxifen, and through binding to the oestrogen-related receptor gamma (ERRγ) may prevent its deactivation by growth inhibitors. Hallmark 3: in the 10 nM to 1 μM range, parabens give a dose-dependent evasion of apoptosis in high-risk donor breast epithelial cells. Hallmark 4: long-term exposure (>20 weeks) to parabens leads to increased migratory and invasive activity in human breast cancer cells, properties which are linked to the metastatic process. Emerging hallmark: methylparaben has been shown in human breast epithelial cells to increase mTOR, a key regulator of energy metabolism. Enabling characteristic: parabens can cause DNA damage at high concentrations in the short term, but more work is needed to investigate long-term low doses of mixtures. The ability of parabens to enable multiple cancer hallmarks in human breast epithelial cells provides grounds for regulatory review of the implications of the presence of parabens in human breast tissue.
Abstract:
Many of the next generation of global climate models will include aerosol schemes which explicitly simulate the microphysical processes that determine the particle size distribution. These models enable aerosol optical properties and cloud condensation nuclei (CCN) concentrations to be determined by fundamental aerosol processes, which should lead to a more physically based simulation of aerosol direct and indirect radiative forcings. This study examines the global variation in particle size distribution simulated by 12 global aerosol microphysics models to quantify model diversity and to identify any common biases against observations. Evaluation against size distribution measurements from a new European network of aerosol supersites shows that the mean model agrees quite well with the observations at many sites on the annual mean, but there are some seasonal biases common to many sites. In particular, at many of these European sites, the accumulation mode number concentration is biased low during winter and Aitken mode concentrations tend to be overestimated in winter and underestimated in summer. At high northern latitudes, the models strongly underpredict Aitken and accumulation particle concentrations compared to the measurements, consistent with previous studies that have highlighted the poor performance of global aerosol models in the Arctic. In the marine boundary layer, the models capture the observed meridional variation in the size distribution, which is dominated by the Aitken mode at high latitudes, with an increasing concentration of accumulation particles with decreasing latitude. Considering vertical profiles, the models reproduce the observed peak in total particle concentrations in the upper troposphere due to new particle formation, although modelled peak concentrations tend to be biased high over Europe. 
Overall, the multi-model-mean data set simulates the global variation of the particle size distribution with a good degree of skill, suggesting that most of the individual global aerosol microphysics models are performing well, although the large model diversity indicates that some models are in poor agreement with the observations. Further work is required to better constrain size-resolved primary and secondary particle number sources, and an improved understanding of nucleation and growth (e.g. the role of nitrate and secondary organics) will improve the fidelity of simulated particle size distributions.
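The particle size distributions these schemes simulate are conventionally represented as a sum of lognormal modes. A minimal sketch of that standard form follows; the Aitken and accumulation mode parameters are invented for illustration and are not taken from any of the twelve models:

```python
# Number size distribution as a sum of lognormal modes, the standard form
# used by modal aerosol microphysics schemes:
#   dN/dlnD = sum_i N_i / (sqrt(2*pi) * ln(sigma_i))
#             * exp(-(ln(D / Dg_i))**2 / (2 * ln(sigma_i)**2))
# Mode parameters below are illustrative only.
import math

def dN_dlnD(D, modes):
    total = 0.0
    for N, Dg, sigma in modes:
        ln_sig = math.log(sigma)
        total += N / (math.sqrt(2 * math.pi) * ln_sig) * math.exp(
            -(math.log(D / Dg)) ** 2 / (2 * ln_sig ** 2))
    return total

modes = [(1500.0, 0.04, 1.6),   # Aitken:       N [cm^-3], Dg [um], sigma_g
         (400.0,  0.15, 1.5)]   # accumulation
```

At D = Dg a mode contributes its maximum, N/(√(2π)·ln σg), so biases in modal number N or median diameter Dg translate directly into the seasonal Aitken/accumulation biases described above.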
Abstract:
Land cover plays a key role in global to regional monitoring and modeling because it affects and is affected by climate change, and has thus become one of the essential variables for climate change studies. National and international organizations require timely and accurate land cover information for reporting and management actions. The North American Land Change Monitoring System (NALCMS) is an international cooperation of organizations and entities of Canada, the United States, and Mexico to map land cover change of North America's changing environment. This paper presents the methodology to derive the land cover map of Mexico for the year 2005, which was integrated in the NALCMS continental map. Based on a time series of 250 m Moderate Resolution Imaging Spectroradiometer (MODIS) data and an extensive sample database, the complexity of the Mexican landscape required a specific approach to reflect land cover heterogeneity. To estimate the proportion of each land cover class for every pixel, several decision tree classifications were combined to obtain class membership maps, which were finally converted to a discrete map accompanied by a confidence estimate. The map yielded an overall accuracy of 82.5% (Kappa of 0.79) for pixels with at least 50% map confidence (71.3% of the data). An additional assessment with 780 randomly stratified samples and primary and alternative calls in the reference data to account for ambiguity indicated 83.4% overall accuracy (Kappa of 0.80). A high agreement of 83.6% for all pixels and 92.6% for pixels with a map confidence of more than 50% was found for the comparison between the land cover maps of 2005 and 2006. Further wall-to-wall comparisons to related land cover maps resulted in 56.6% agreement with the MODIS land cover product and a congruence of 49.5% with GlobCover.
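The accuracy figures quoted (overall accuracy and Kappa) are both derived from a confusion matrix of map versus reference labels. A minimal sketch of both statistics on a toy 2×2 matrix; the counts are invented, not the paper's validation data:

```python
# Overall accuracy and Cohen's kappa from a confusion matrix
# (rows = map class, columns = reference class). Toy counts only;
# the paper reports 82.5% accuracy / kappa 0.79 on its own data.
def accuracy_and_kappa(cm):
    n = sum(sum(row) for row in cm)
    k = len(cm)
    observed = sum(cm[i][i] for i in range(k)) / n            # diagonal agreement
    expected = sum(sum(cm[i]) * sum(r[i] for r in cm)         # chance agreement
                   for i in range(k)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

acc, kappa = accuracy_and_kappa([[45, 5],
                                 [10, 40]])
```

Kappa discounts the agreement expected by chance from the class marginals, which is why it is always lower than the raw overall accuracy it accompanies.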
Abstract:
Solving pharmaceutical crystal structures from powder diffraction data is discussed in terms of the methodologies that have been applied and the complexity of the structures that have been solved. The principles underlying these methodologies are summarized and representative examples of polymorph, solvate, salt and cocrystal structure solutions are provided, together with examples of some particularly challenging structure determinations.
Abstract:
For users of climate services, the ability to quickly determine the datasets that best fit one's needs would be invaluable. The volume, variety and complexity of climate data make this judgment difficult. The ambition of CHARMe ("Characterization of metadata to enable high-quality climate services") is to give a wider interdisciplinary community access to a range of supporting information, such as journal articles, technical reports or feedback on previous applications of the data. The capture and discovery of this "commentary" information, often created by data users rather than data providers, and currently not linked to the data themselves, has not been significantly addressed previously. CHARMe applies the principles of Linked Data and open web standards to associate, record, search and publish user-derived annotations in a way that can be read both by users and automated systems. Tools have been developed within the CHARMe project that enable annotation capability for data delivery systems already in wide use for discovering climate data. In addition, the project has developed advanced tools for exploring data and commentary in innovative ways, including an interactive data explorer and comparator ("CHARMe Maps") and a tool for correlating climate time series with external "significant events" (e.g. instrument failures or large volcanic eruptions) that affect the data quality. Although the project focuses on climate science, the concepts are general and could be applied to other fields. All CHARMe system software is open-source, released under a liberal licence, permitting future projects to re-use the source code as they wish.
Abstract:
With the increase in e-commerce and the digitisation of design data and information, the construction sector has become reliant upon IT infrastructure and systems. The design and production process is more complex, more interconnected, and reliant upon greater information mobility, with seamless exchange of data and information in real time. Construction small and medium-sized enterprises (CSMEs), in particular the speciality contractors, can effectively utilise cost-effective collaboration-enabling technologies, such as cloud computing, to help in the effective transfer of information and data to improve productivity. The system dynamics (SD) approach offers a perspective and tools to enable a better understanding of the dynamics of complex systems. This research focuses upon the system dynamics methodology as a modelling and analysis tool in order to understand and identify the key drivers in the absorption of cloud computing for CSMEs. The aim of this paper is to determine how the use of SD can improve the management of information flow through collaborative technologies, leading to improved productivity. The data supporting the use of system dynamics were obtained through a pilot study consisting of questionnaires and interviews from five CSMEs in the UK house-building sector.
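The pilot-study SD model itself is not given in the abstract. To illustrate the stock-and-flow mechanics the SD approach relies on, here is a minimal, hypothetical sketch in which a single stock (CSMEs that have adopted cloud collaboration tools) is filled by a Bass-style adoption flow; all parameter values are invented:

```python
# Minimal system-dynamics sketch: one stock (adopting CSMEs) filled by an
# adoption flow proportional to contact between adopters and non-adopters
# (Bass-style diffusion), integrated with Euler steps. Parameters are
# illustrative, not from the study's pilot data.
def simulate(total=100.0, adopters=5.0, p=0.01, q=0.3, dt=0.25, steps=160):
    history = [adopters]
    for _ in range(steps):
        potential = total - adopters
        flow = (p + q * adopters / total) * potential  # adoption rate
        adopters += flow * dt                          # Euler-integrate the stock
        history.append(adopters)
    return history

trajectory = simulate()  # S-shaped adoption curve over 40 time units
```

Even this toy model exhibits the feedback structure SD is used to expose: early adopters accelerate adoption among the remaining firms until the pool of non-adopters is exhausted.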
Abstract:
Conceptualisations of disability that emphasise the contextual and cultural nature of disability and the embodiment of these within a national system of data collection present a number of challenges, especially where this process is devolved to schools. The requirement for measures based on contextual and subjective experiences gives rise to particular difficulties in achieving parity in the way data is analysed and reported. This paper presents an account of the testing of a tool intended for use by schools as they collect data from parents to identify children who meet the criteria of disability established in Disability Discrimination Acts (DDAs). Data were validated through interviews with parents and teachers and observations of children, and highlighted the pivotal role of the criterion of impact. The findings are set in the context of schools meeting their legal duties to identify disabled children and their support needs in a way that captures the complexity of disabled children's school lives and provides useful and useable data.
Abstract:
The size and complexity of data sets generated within ecosystem-level programmes merit their capture, curation, storage, analysis, synthesis and visualisation using Big Data approaches. This review looks at previous attempts to organise and analyse such data through the International Biological Programme and draws on the mistakes made and the lessons learned for effective Big Data approaches to current Research Councils United Kingdom (RCUK) ecosystem-level programmes, using Biodiversity and Ecosystem Service Sustainability (BESS) and Environmental Virtual Observatory Pilot (EVOp) as exemplars. The challenges raised by such data are identified and explored, and suggestions are made for the two major issues of extending analyses across different spatio-temporal scales and of effectively integrating quantitative and qualitative data.