56 results for Compositional data analysis-roots in geosciences



Abstract:

Single component geochemical maps are the most basic representation of spatial elemental distributions and are commonly used in environmental and exploration geochemistry. However, the compositional nature of geochemical data imposes several limitations on how the data should be presented. The problems relate to the constant sum constraint (closure) and the inherently multivariate, relative information conveyed by compositional data. Well known, for instance, is the tendency of all heavy metals to show lower values in soils with significant contributions of diluting elements (e.g., the quartz dilution effect), or the contrary effect, apparent enrichment in many elements due to removal of potassium during weathering. The validity of classical single component maps is thus investigated, and reasonable alternatives that honour the compositional character of geochemical concentrations are presented. The first recommended method relies on knowledge-driven log-ratios, chosen to highlight certain geochemical relations or to filter known artefacts (e.g. dilution with SiO2 or volatiles); this is similar to the classical approach of normalising against a single element. The second approach uses so-called log-contrasts, which employ suitable statistical methods (classification techniques, regression analysis, principal component analysis, clustering of variables, etc.) to extract potentially interesting geochemical summaries. The caution from this work is that, without a compositional approach, it is difficult to guarantee that any identified pattern, trend or anomaly is not an artefact of the constant sum constraint. In summary, the authors recommend a chain of enquiry: identify the statistical method that can answer the required geological or geochemical question while maintaining the integrity of the compositional nature of the data, apply the required log-ratio transformations, and then apply the chosen method. Interpreting the results may require a closer working relationship between statisticians, data analysts and geochemists.
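
To make the recommended transformations concrete, the following minimal Python sketch computes a knowledge-driven log-ratio against a diluent and a centred log-ratio (clr) transform. The oxide values and column names are hypothetical; this illustrates the general technique, not the authors' own code.

```python
import numpy as np
import pandas as pd

# Hypothetical major-element composition (wt%); rows are samples.
df = pd.DataFrame({
    "SiO2":  [65.2, 48.1, 72.4],
    "Al2O3": [14.8, 16.3, 12.1],
    "Fe2O3": [ 4.9, 11.2,  2.8],
    "K2O":   [ 3.1,  0.9,  4.4],
})

# Knowledge-driven log-ratio: a single element (here K2O) expressed
# relative to the main diluent SiO2, filtering the quartz dilution effect.
df["log_K2O_SiO2"] = np.log(df["K2O"] / df["SiO2"])

# Centred log-ratio (clr): each part relative to the geometric mean of the
# composition, so mapped values carry only relative information.
oxides = ["SiO2", "Al2O3", "Fe2O3", "K2O"]
parts = df[oxides].to_numpy()
clr = np.log(parts) - np.log(parts).mean(axis=1, keepdims=True)
print(pd.DataFrame(clr, columns=[f"clr_{c}" for c in oxides]).round(3))
```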


Abstract:

Identifying differential expression of genes in psoriatic and healthy skin by microarray data analysis is a key approach to understanding the pathogenesis of psoriasis. Analysing more than one dataset to identify commonly upregulated genes reduces the likelihood of false positives and narrows down the possible signature genes. Genes controlling the critical balance between T helper 17 and regulatory T cells are of special interest in psoriasis. Our objective was to identify genes consistently upregulated in lesional skin across three published microarray datasets. We reanalysed gene expression data extracted from three experiments on samples from psoriatic and nonlesional skin using the same stringency threshold and software, and further compared the expression levels of 92 genes related to the T helper 17 and regulatory T cell signaling pathways. We found 73 probe sets, representing 57 genes, commonly upregulated in lesional skin in all datasets. These included 26 probe sets representing 20 genes with no previous link to the etiopathogenesis of psoriasis. These genes may represent novel therapeutic targets, although they require more rigorous experimental testing to be validated. Our analysis also identified 12 of the 92 genes known to be related to the T helper 17 and regulatory T cell signaling pathways as differentially expressed in the lesional skin samples.
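
As an illustration of the intersection step described above, here is a minimal Python sketch; the file names, column names and thresholds are hypothetical stand-ins for the three datasets and the common stringency threshold.

```python
from functools import reduce
import pandas as pd

# Hypothetical per-dataset result tables with columns:
# "probe", "gene", "log2fc" (lesional vs nonlesional) and "adj_p".
datasets = [pd.read_csv(f) for f in ("gse_a.csv", "gse_b.csv", "gse_c.csv")]

def upregulated(res, lfc=1.0, alpha=0.05):
    """Probe sets passing the same stringency threshold in one dataset."""
    hits = res[(res["log2fc"] >= lfc) & (res["adj_p"] < alpha)]
    return set(hits["probe"])

# Probe sets upregulated in lesional skin in *all* datasets.
common_probes = reduce(set.intersection, (upregulated(d) for d in datasets))

# Map surviving probes back to gene symbols (assumes unique probe IDs).
annot = datasets[0].set_index("probe")["gene"]
common_genes = sorted(set(annot.loc[sorted(common_probes)]))
print(len(common_probes), "probe sets /", len(common_genes), "genes in common")
```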


Abstract:

The predominant fear in capital markets is that of a price spike. Commodity markets differ in that there is fear of both upward and downward jumps, which results in implied volatility curves with distinct shapes compared to equity markets. A novel functional data analysis (FDA) approach provides a framework to produce and interpret functional objects that characterise the underlying dynamics of oil future options. We use the FDA framework to examine implied volatility, jump risk, and pricing dynamics within crude oil markets. Examining a WTI crude oil sample for the 2007–2013 period, which includes the global financial crisis and the Arab Spring, we find strong evidence of converse jump dynamics during periods of demand- and supply-side weakness. This is used as the basis for an FDA-derived Merton (1976) jump diffusion optimised delta hedging strategy, which exhibits superior portfolio management results over traditional methods.
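
The delta hedging strategy mentioned above builds on the Merton (1976) jump diffusion model, whose call delta is a Poisson-weighted mixture of Black-Scholes deltas. The sketch below implements that standard formula with illustrative, uncalibrated parameters; it does not reproduce the paper's FDA-derived calibration.

```python
from math import exp, factorial, log, sqrt
import numpy as np
from scipy.stats import norm

def bs_call_delta(S, K, T, r, sigma):
    """Black-Scholes delta of a European call, N(d1)."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    return norm.cdf(d1)

def merton_call_delta(S, K, T, r, sigma, lam, mu_j, sig_j, n_max=50):
    """Call delta under Merton (1976) jump diffusion: each term conditions
    on n jumps occurring, with jump-adjusted rate and volatility."""
    k = exp(mu_j + 0.5 * sig_j**2) - 1.0        # mean relative jump size
    lam_p = lam * (1.0 + k)                     # jump-adjusted intensity
    delta = 0.0
    for n in range(n_max):
        w = exp(-lam_p * T) * (lam_p * T)**n / factorial(n)
        sigma_n = sqrt(sigma**2 + n * sig_j**2 / T)
        r_n = r - lam * k + n * log(1.0 + k) / T
        delta += w * bs_call_delta(S, K, T, r_n, sigma_n)
    return delta

# Illustrative parameters, not calibrated to the paper's WTI sample.
print(merton_call_delta(S=95.0, K=100.0, T=0.5, r=0.02,
                        sigma=0.30, lam=0.8, mu_j=-0.05, sig_j=0.15))
```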


Abstract:

The inclusion of collisional rates for He-like Fe and Ca ions is discussed with reference to the analysis of solar flare Fe XXV and Ca XIX line emission, particularly from the Yohkoh Bragg Crystal Spectrometer (BCS). The new data are a slight improvement on the calculations presently used in the BCS analysis software, in that the discrepancy in the Fe XXV y and z line intensities (observed larger than predicted) is reduced. Values of electron temperature from satellite-to-resonance line ratios are slightly reduced (by up to 1 MK) for a given observed ratio. The new atomic data will be incorporated into the Yohkoh BCS databases, and should also be of interest for the analysis of high-resolution, non-solar spectra expected from the Constellation-X and Astro-E space missions. A tokamak S XV spectrum is compared with a synthetic spectrum using atomic data in the existing software, and the agreement is found to be good, validating these data, particularly for high-n satellite wavelengths close to the S XV resonance line. An error in a data file used for analyzing BCS Fe XXVI spectra is corrected, permitting analysis of these spectra.
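
For background, the satellite-to-resonance temperature diagnostic rests on the standard result (Gabriel 1972) that a dielectronic satellite to resonance line intensity ratio scales approximately as T_e^{-1} exp(dE / k T_e), so an observed ratio can be inverted numerically for T_e. The sketch below uses schematic placeholder constants, not the BCS atomic data.

```python
import numpy as np
from scipy.optimize import brentq

K_B = 8.617e-5  # Boltzmann constant [eV/K]

def satellite_to_resonance(T_e, c0=2.0e7, delta_e=530.0):
    """Schematic satellite/resonance ratio ~ (c0/T_e) * exp(dE / kT_e).
    c0 and delta_e [eV] are placeholders, not the BCS atomic data."""
    return (c0 / T_e) * np.exp(delta_e / (K_B * T_e))

def temperature_from_ratio(r_obs, T_lo=5e6, T_hi=5e7):
    """Invert the monotonically decreasing ratio for electron temperature."""
    return brentq(lambda T: satellite_to_resonance(T) - r_obs, T_lo, T_hi)

r = satellite_to_resonance(18e6)          # synthetic "observed" ratio
print(f"recovered T_e = {temperature_from_ratio(r):.3e} K")
```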


Abstract:

Correlation analyses were conducted on nickel (Ni), vanadium (V) and zinc (Zn) oral bioaccessible fractions (BAFs) and selected geochemical parameters to identify specific controls exerted over trace element bioaccessibility. BAFs were determined in previous research using the unified BARGE method. Total trace element concentrations and soil geochemical parameters were analysed as part of the Geological Survey of Northern Ireland Tellus Project. Correlation analysis compared Ni, V and Zn BAFs against their total concentrations, pH, estimated soil organic carbon (SOC) and a further eight element oxides. BAF data were divided into three generic bedrock classifications (basalt, lithic arenite and mudstone) prior to analysis, which increased the average correlation coefficients between BAFs and geochemical parameters. Sulphur trioxide and SOC, spatially correlated with upland peat soils, exhibited significant positive correlations with all BAFs in the gastric and gastro-intestinal digestion phases, with the effect strongest in the lithic arenite bedrock group. Significant negative relationships between bioaccessible Ni, V and Zn and their associated total concentrations were observed for the basalt group. Major element oxides were associated with reduced oral trace element bioaccessibility, with Al2O3 yielding the highest number of significant negative correlations, followed by Fe2O3. Spatial mapping showed that metal oxides were present at reduced levels in peat soils. The findings illustrate how specific geology and soil geochemistry exert controls over trace element bioaccessibility, with soil chemical factors having a stronger influence on BAF results than relative geogenic abundance. In general, higher Ni, V and Zn bioaccessibility is expected in peat soil types.
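
A grouped correlation analysis of this kind can be expressed compactly; the sketch below (hypothetical file and column names) computes Spearman correlations of each BAF against each geochemical parameter within each bedrock group and filters for significance.

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical table: one row per soil sample; column names are illustrative.
df = pd.read_csv("tellus_baf.csv")
bafs = ["Ni_BAF", "V_BAF", "Zn_BAF"]
params = ["Ni_total", "V_total", "Zn_total", "pH", "SOC",
          "Al2O3", "Fe2O3", "SO3"]

rows = []
for bedrock, grp in df.groupby("bedrock"):   # basalt / lithic arenite / mudstone
    for b in bafs:
        for p in params:
            rho, pval = spearmanr(grp[b], grp[p], nan_policy="omit")
            rows.append({"bedrock": bedrock, "baf": b, "param": p,
                         "rho": rho, "p": pval})

corr = pd.DataFrame(rows)
# Keep only significant associations, as in the analysis described above.
print(corr[corr["p"] < 0.05].sort_values(["bedrock", "rho"]))
```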


Abstract:

Model selection between competing models is a key consideration in the discovery of prognostic multigene signatures. The use of appropriate statistical performance measures, together with verification of the biological significance of the signatures, is imperative to maximise the chance of external validation of the generated signatures. Current approaches in time-to-event studies often use only a single measure of performance in model selection, such as log-rank test p-values, or dichotomise the follow-up times at some phase of the study to facilitate signature discovery. In this study we improve the prognostic signature discovery process by applying the multivariate partial Cox model combined with the concordance index, the hazard ratio of predictions, independence from available clinical covariates, and biological enrichment as measures of signature performance. The proposed framework was applied to discover prognostic multigene signatures from early breast cancer data. The partial Cox model combined with the multiple performance measures was used both to guide the selection of the optimal panel of prognostic genes and to predict risk within cross-validation, without dichotomising the follow-up times at any stage. The signatures were successfully validated in independent external breast cancer datasets, yielding a hazard ratio of 2.55 [1.44, 4.51] for the top-ranking signature.
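
As a simplified illustration of scoring a signature by concordance within cross-validation, here is a Python sketch using lifelines; an ordinary penalised Cox model stands in for the paper's multivariate partial Cox model, and the data file and column names are hypothetical.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.utils import concordance_index
from sklearn.model_selection import KFold

# Hypothetical expression matrix: columns gene_1..gene_k, "time", "event".
df = pd.read_csv("breast_cancer_expr.csv")

cindexes = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(df):
    train, test = df.iloc[train_idx], df.iloc[test_idx]
    # Penalised Cox here as a stand-in for the paper's partial Cox model.
    cph = CoxPHFitter(penalizer=0.1)
    cph.fit(train, duration_col="time", event_col="event")
    risk = cph.predict_partial_hazard(test)
    # Concordance on held-out data; higher risk should mean shorter survival,
    # hence the negation of the predicted hazard.
    cindexes.append(concordance_index(test["time"], -risk, test["event"]))

print("mean cross-validated c-index:", np.mean(cindexes))
```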


Abstract:

Retrospective clinical datasets are often characterized by a relatively small sample size and many missing values. In this case, a common way of handling the missingness is to discard patients with missing covariates from the analysis, further reducing the sample size. Alternatively, if the mechanism that generated the missingness allows, incomplete data can be imputed on the basis of the observed data, avoiding the reduction of the sample size and allowing complete-data methods to be applied afterwards. Moreover, methodologies for data imputation may depend on the particular purpose and may achieve better results by considering specific characteristics of the domain. We study the problem of missing data treatment in the context of survival tree analysis for the estimation of a prognostic patient stratification. Survival tree methods usually address this problem with surrogate splits, that is, splitting rules that use other variables yielding results similar to the original ones. Instead, our methodology models the dependencies among the clinical variables with a Bayesian network, which is then used to perform data imputation, allowing the survival tree to be applied to the completed dataset. The Bayesian network is learned directly from the incomplete data using a structural expectation-maximization (EM) procedure in which the maximization step is performed with an exact anytime method, so that the only source of approximation is the EM formulation itself. On both simulated and real data, the proposed methodology usually outperformed several existing methods for data imputation, and the imputation so obtained improved the stratification estimated by the survival tree (especially relative to using surrogate splits).
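
A heavily simplified sketch of this pipeline follows, using pgmpy and scikit-survival; hill-climbing structure learning on complete cases stands in for the paper's structural EM, the clinical variables are assumed already discretised, and all file and column names are hypothetical.

```python
import pandas as pd
from pgmpy.estimators import BayesianEstimator, BicScore, HillClimbSearch
from pgmpy.inference import VariableElimination
from pgmpy.models import BayesianNetwork
from sksurv.tree import SurvivalTree
from sksurv.util import Surv

# Hypothetical dataset of discretised covariates plus follow-up columns.
df = pd.read_csv("clinical.csv")             # covariates + "time", "event"
covs = [c for c in df.columns if c not in ("time", "event")]

# Simplification: learn structure and parameters from complete cases only,
# instead of the paper's structural EM over the incomplete data.
complete = df[covs].dropna()
dag = HillClimbSearch(complete).estimate(scoring_method=BicScore(complete))
bn = BayesianNetwork(dag.edges())
bn.add_nodes_from(covs)                      # keep isolated variables too
bn.fit(complete, estimator=BayesianEstimator)

# Impute each patient's missing covariates by a MAP query given the rest.
infer = VariableElimination(bn)
imputed = df.copy()
for i, row in df[covs].iterrows():
    missing = [c for c in covs if pd.isna(row[c])]
    if missing:
        evidence = {c: row[c] for c in covs if pd.notna(row[c])}
        map_vals = infer.map_query(variables=missing, evidence=evidence,
                                   show_progress=False)
        for c, v in map_vals.items():
            imputed.at[i, c] = v

# Survival tree on the completed dataset (one-hot encode the categories).
X = pd.get_dummies(imputed[covs])
y = Surv.from_arrays(event=imputed["event"].astype(bool),
                     time=imputed["time"])
tree = SurvivalTree(max_depth=3).fit(X, y)
```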


Abstract:

Perfect information is seldom available to humans or machines due to uncertainties inherent in real-world problems. Uncertainties in geographic information systems (GIS) stem from either vague/ambiguous or imprecise/inaccurate/incomplete information, and it is necessary for GIS to develop tools and techniques to manage these uncertainties. There is widespread agreement in the GIS community that although GIS has the potential to support a wide range of spatial data analysis problems, this potential is often hindered by a lack of consistency and uniformity. Uncertainties come in many shapes and forms, and processing uncertain spatial data requires a practical taxonomy to aid decision makers in choosing the most suitable data modeling and analysis method. In this paper, we: (1) review important developments in handling uncertainties when working with spatial data and GIS applications; (2) propose a taxonomy of models for dealing with uncertainties in GIS; and (3) identify current challenges and future research directions in spatial data analysis and GIS for managing uncertainties.


Abstract:

This study applies spatial statistical techniques, including cokriging, to integrate airborne geophysical (radiometric) data with ground-based measurements of peat depth and soil organic carbon (SOC), in order to monitor change in peat cover for carbon stock calculations. The research is part of the EU-funded Tellus Border project and is supported by the INTERREG IVA development programme of the European Regional Development Fund, which is managed by the Special EU Programmes Body (SEUPB). The premise is that saturated peat attenuates the radiometric signal from underlying soils and rocks. Contemporaneous ground-based measurements were collected to corroborate mapped estimates and to develop a statistical model for volumetric carbon content (VCC) to 0.5 metres. Field measurements included ground penetrating radar, gamma ray spectrometry and a soil sampling methodology that measured bulk density and soil moisture to determine VCC. One aim of the study was to explore whether airborne radiometric survey data can be used to establish VCC across a region. To account for the footprint of the airborne radiometric data, five cores were obtained at each soil sampling location: one at the centre of the equivalent ground radiometric sample location and one at each of four corners 20 metres apart. This soil sampling strategy replicated the methodology deployed for the Tellus Border geochemistry survey. Two key issues arise from this work: the first is the integration of different sampling supports for airborne and ground-measured data, and the second is the compositional nature of the VCC data.
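
Full cokriging is not readily available in common Python libraries, but kriging with the airborne radiometric signal as an external drift is a closely related stand-in that exploits the same attenuation relationship. The sketch below uses PyKrige with hypothetical inputs; coordinates, grids and file names are placeholders.

```python
import numpy as np
from pykrige.uk import UniversalKriging

# Hypothetical inputs: ground peat-depth cores (x, y, depth) and an airborne
# radiometric grid used as the secondary variable.
x = np.load("cores_x.npy")
y = np.load("cores_y.npy")
depth = np.load("cores_depth.npy")
gridx = np.arange(0.0, 5000.0, 50.0)
gridy = np.arange(0.0, 5000.0, 50.0)
radiometric = np.load("radiometric_grid.npy")   # shape (len(gridy), len(gridx))

# Kriging with an external drift: the radiometric signal informs the peat
# depth estimate wherever ground cores are sparse. Core locations must fall
# inside the drift grid.
uk = UniversalKriging(
    x, y, depth,
    variogram_model="spherical",
    drift_terms=["external_Z"],
    external_drift=radiometric,
    external_drift_x=gridx,
    external_drift_y=gridy,
)
depth_map, variance = uk.execute("grid", gridx, gridy)
```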