152 results for DATASETS


Relevance:

10.00% 10.00%

Publisher:

Abstract:

We examine mid- to late Holocene centennial-scale climate variability in Ireland using proxy data from peatlands, lakes and a speleothem. A high degree of between-record variability is apparent in the proxy data and significant chronological uncertainties are present. However, tephra layers provide a robust tool for correlation and improve the chronological precision of the records. Although we can find no statistically significant coherence in the dataset as a whole, a selection of high-quality peatland water table reconstructions co-vary more than would be expected by chance alone. A locally weighted regression model with bootstrapping can be used to construct a ‘best-estimate’ palaeoclimatic reconstruction from these datasets. Visual comparison and cross-wavelet analysis of peatland water table compilations from Ireland and Northern Britain show that there are some periods of coherence between these records. Some terrestrial palaeoclimatic changes in Ireland appear to coincide with changes in the North Atlantic thermohaline circulation and solar activity. However, these relationships are inconsistent and may be obscured by chronological uncertainties. We conclude by suggesting an agenda for future Holocene climate research in Ireland. ©2013 Elsevier B.V. All rights reserved.
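
The locally weighted regression with bootstrapping mentioned above can be illustrated with a minimal sketch. It is not the authors' model: the tricube kernel, the fixed bandwidth, the percentile interval and the names `loess_point` and `bootstrap_band` are all assumptions made for illustration, and the data are synthetic.

```python
import random


def loess_point(x0, xs, ys, bandwidth):
    """Local linear fit at x0 with tricube weights (assumed kernel and fixed bandwidth)."""
    weights = []
    for x in xs:
        d = abs(x - x0) / bandwidth
        weights.append((1 - d ** 3) ** 3 if d < 1 else 0.0)
    total = sum(weights)
    if total == 0:
        return None  # no observations inside the window
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    if sxx == 0:
        return mean_y  # only one distinct x in the window: fall back to the weighted mean
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    return mean_y + (sxy / sxx) * (x0 - mean_x)


def bootstrap_band(x0, xs, ys, bandwidth, n_boot=200, seed=0):
    """Percentile interval (2.5%, 97.5%) of the local fit under resampling with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    fits = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        fit = loess_point(x0, [xs[i] for i in idx], [ys[i] for i in idx], bandwidth)
        if fit is not None:
            fits.append(fit)
    fits.sort()
    lower = fits[int(0.025 * len(fits))]
    upper = fits[min(len(fits) - 1, int(0.975 * len(fits)))]
    return lower, upper


# Synthetic, exactly linear "proxy" series: y = 2x + 1
xs = [float(i) for i in range(11)]
ys = [2 * x + 1 for x in xs]
estimate = loess_point(5.0, xs, ys, bandwidth=3.0)
band = bootstrap_band(5.0, xs, ys, bandwidth=3.0)
```

Repeating `loess_point` over a grid of ages gives a smoothed curve, and the bootstrap band indicates how sensitive that curve is to the particular set of samples.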

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper, we introduce an application of matrix factorization to produce corpus-derived, distributional models of semantics that demonstrate cognitive plausibility. We find that word representations learned by Non-Negative Sparse Embedding (NNSE), a variant of matrix factorization, are sparse, effective, and highly interpretable. To the best of our knowledge, this is the first approach that yields semantic representations of words satisfying these three desirable properties. Through extensive experimental evaluations on multiple real-world tasks and datasets, we demonstrate the superiority of semantic models learned by NNSE over other state-of-the-art baselines.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In most previous research on distributional semantics, Vector Space Models (VSMs) of words are built either from topical information (e.g., documents in which a word is present), or from syntactic/semantic types of words (e.g., dependency parse links of a word in sentences), but not both. In this paper, we explore the utility of combining these two representations to build VSMs for the task of semantic composition of adjective-noun phrases. Through extensive experiments on benchmark datasets, we find that even though a type-based VSM is effective for semantic composition, it is often outperformed by a VSM built using a combination of topic- and type-based statistics. We also introduce a new evaluation task wherein we predict the composed vector representation of a phrase from the brain activity of a human subject reading that phrase. We exploit a large syntactically parsed corpus of 16 billion tokens to build our VSMs, with vectors for both phrases and words, and make them publicly available.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Our review of paleoclimate information for New Zealand pertaining to the past 30,000 years has identified a general sequence of climatic events, spanning the onset of cold conditions marking the final phase of the Last Glaciation, through to the emergence to full interglacial conditions in the early Holocene. In order to facilitate more detailed assessments of climate variability and any leads or lags in the timing of climate changes across the region, a composite stratotype is proposed for New Zealand. The stratotype is based on terrestrial stratigraphic records and is intended to provide a standard reference for the intercomparison and evaluation of climate proxy records. We nominate a specific stratigraphic type record for each climatic event, using either natural exposure or drill core stratigraphic sections. Type records were selected on the basis of having very good numerical age control and a clear proxy record. In all cases the main proxy of the type record is subfossil pollen. The type record for the period from ca 30 to ca 18 calendar kiloyears BP (cal. ka BP) is designated in lake-bed sediments from a small morainic kettle lake (Galway tarn) in western South Island. The Galway tarn type record spans a period of full glacial conditions (Last Glacial Coldest Period, LGCP) within the Otira Glaciation, and includes three cold stadials separated by two cool interstadials. The type record for the emergence from glacial conditions following the termination of the Last Glaciation (post-Termination amelioration) is in a core of lake sediments from a maar (Pukaki volcanic crater) in Auckland, northern North Island, and spans from ca 18 to 15.64±0.41 cal. ka BP. The type record for the Lateglacial period is an exposure of interbedded peat and mud at montane Kaipo bog, eastern North Island. In this high-resolution type record, an initial mild period was succeeded at 13.74±0.13 cal. ka BP by a cooler period, which after 12.55±0.14 cal. ka BP gave way to a progressive ascent to full interglacial conditions that were achieved by 11.88±0.18 cal. ka BP. Although a type section is not formally designated for the Holocene Interglacial (11.88±0.18 cal. ka BP to the present day), the sedimentary record of Lake Maratoto on the Waikato lowlands, northwestern North Island, is identified as a prospective type section pending the integration and updating of existing stratigraphic and proxy datasets, and age models. The type records are interconnected by one or more dated tephra layers, the ages of which are derived from Bayesian depositional modelling and OxCal-based calibrations using the IntCal09 dataset. Along with the type sections and the Lake Maratoto record, important, well-dated terrestrial reference records are provided for each climate event. Climate proxies from these reference records include pollen flora, stable isotopes from speleothems, beetle and chironomid fauna, and glacier moraines. The regional composite stratotype provides a benchmark against which to compare other records and proxies. Based on the composite stratotype, we provide an updated climate event stratigraphic classification for the New Zealand region. © 2013 Elsevier Ltd.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background: Modern cancer research often involves large datasets and the use of sophisticated statistical techniques. Together these add a heavy computational load to the analysis, which is often coupled with issues surrounding data accessibility. Connectivity mapping is an advanced bioinformatic and computational technique dedicated to therapeutics discovery and drug re-purposing around differential gene expression analysis. On a normal desktop PC, it is common for the connectivity mapping task with a single gene signature to take >2h to complete using sscMap, a popular Java application that runs on standard CPUs (Central Processing Units). Here, we describe new software, cudaMap, which has been implemented using CUDA C/C++ to harness the computational power of NVIDIA GPUs (Graphics Processing Units) to greatly reduce processing times for connectivity mapping.

Results: cudaMap can identify candidate therapeutics from the same signature in just over thirty seconds when using an NVIDIA Tesla C2050 GPU. Results from the analysis of multiple gene signatures, which would previously have taken several days, can now be obtained in as little as 10 minutes, greatly facilitating candidate therapeutics discovery with high throughput. We are able to demonstrate dramatic speed differentials between GPU assisted performance and CPU executions as the computational load increases for high accuracy evaluation of statistical significance.

Conclusion: Emerging 'omics' technologies are constantly increasing the volume of data and information to be processed in all areas of biomedical research. Embracing the multicore functionality of GPUs represents a major avenue of local accelerated computing. cudaMap will make a strong contribution in the discovery of candidate therapeutics by enabling speedy execution of heavy duty connectivity mapping tasks, which are increasingly required in modern cancer research. cudaMap is open source and can be freely downloaded from http://purl.oclc.org/NET/cudaMap.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The start of the Upper Würmian in the Alps was marked by massive fluvioglacial aggradation prior to the arrival of the Central Alpine glaciers. In 1984, the Subcommission on European Quaternary Stratigraphy defined the clay pit of Baumkirchen (in the foreland of the Inn Valley, Austria) as the stratotype for the Middle to Upper Würmian boundary in the Alps. Key for the selection of this site was its radiocarbon chronology, which still ranks among the most important datasets of this time interval in the Alps. In this study we re-sampled all available original plant specimens and established an accelerator mass spectrometry chronology which supersedes the published 40-year-old chronology. The new data show a much smaller scatter and yielded slightly older conventional radiocarbon dates clustering at ca. 31 14C ka BP. When calibrated using INTCAL13 the new data suggest that the sampled interval of 653-681 m in the clay pit was deposited 34-36 cal ka BP. Using two new radiocarbon dates of bone fragments found in the fluvioglacial gravel above the banded clays allows us to constrain the timing of the marked change from lacustrine to fluvioglacial sedimentation to ca. 32-33 cal ka BP, which suggests a possible link to the Heinrich 3 event in the North Atlantic. Copyright (c) 2013 John Wiley & Sons, Ltd.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Model selection between competing models is a key consideration in the discovery of prognostic multigene signatures. The use of appropriate statistical performance measures as well as verification of biological significance of the signatures is imperative to maximise the chance of external validation of the generated signatures. Current approaches in time-to-event studies often use only a single measure of performance in model selection, such as logrank test p-values, or dichotomise the follow-up times at some phase of the study to facilitate signature discovery. In this study we improve the prognostic signature discovery process through the application of the multivariate partial Cox model combined with the concordance index, hazard ratio of predictions, independence from available clinical covariates and biological enrichment as measures of signature performance. The proposed framework was applied to discover prognostic multigene signatures from early breast cancer data. The partial Cox model combined with the multiple performance measures was used both to guide the selection of the optimal panel of prognostic genes and to predict risk within cross-validation, without dichotomising the follow-up times at any stage. The signatures were successfully externally cross-validated in independent breast cancer datasets, yielding a hazard ratio of 2.55 [1.44, 4.51] for the top ranking signature.
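
One of the performance measures named above, the concordance index, can be sketched for right-censored data as follows. This is a common form of Harrell's C, given here as an illustrative assumption; the abstract does not state which variant the authors used. The function name `concordance_index` and the toy data are hypothetical, and pairs with tied event times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] is truthy) and times[i] < times[j]. Ties in risk score
    count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy follow-up data: the third subject is censored at t = 6
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.3, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.3, 0.7, 0.9])
uninformative = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

Because the index works directly on follow-up times and censoring flags, it needs no dichotomisation of the follow-up times, which is the property the abstract emphasises.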

Relevance:

10.00% 10.00%

Publisher:

Abstract:

According to the axiomatic literature on consensus methods, the best collective choice by one method of preference aggregation can easily be the worst by another. Are award committees, electorates, managers, online retailers, and web-based recommender systems stuck with an impossibility of rational preference aggregation? We investigate this social choice conundrum for seven social choice methods: Condorcet, Borda, Plurality, Antiplurality, the Single Transferable Vote, Coombs, and Plurality Runoff. We rely on Monte Carlo simulations for theoretical results and on twelve ballot datasets from American Psychological Association (APA) presidential elections for empirical results. Each of these elections provides partial rankings of five candidates from about 13,000 to about 20,000 voters. APA preferences are neither domain-restricted nor generated by an Impartial Culture. We find virtually no trace of a Condorcet paradox. In direct contrast with the classical social choice conundrum, competing consensus methods agree remarkably well, especially on the overall best and worst options. The agreement is also robust under perturbations of the preference profile via resampling, even in relatively small pseudosamples. We also explore prescriptive implications of our findings.
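
To make the conundrum concrete, the sketch below applies three of the seven methods (Plurality, Borda and Condorcet) to a small invented profile. It assumes complete rankings, whereas the APA ballots are partial, and the function names, the alphabetical tie-breaking and the ballots are illustrative assumptions rather than anything taken from the paper.

```python
from collections import Counter


def plurality_winner(ballots):
    """Candidate with the most first-place votes (ties broken alphabetically)."""
    counts = Counter(ballot[0] for ballot in ballots)
    return max(sorted(counts), key=lambda c: counts[c])


def borda_winner(ballots):
    """Candidate with the highest Borda score: m-1 points for first place, down to 0 for last."""
    m = len(ballots[0])
    scores = Counter()
    for ballot in ballots:
        for position, candidate in enumerate(ballot):
            scores[candidate] += m - 1 - position
    return max(sorted(scores), key=lambda c: scores[c])


def condorcet_winner(ballots):
    """Candidate who beats every other candidate by strict pairwise majority, or None."""
    candidates = sorted(ballots[0])
    for c in candidates:
        beats_all = all(
            2 * sum(b.index(c) < b.index(d) for b in ballots) > len(ballots)
            for d in candidates
            if d != c
        )
        if beats_all:
            return c
    return None


# Invented profile of nine voters ranking three candidates
ballots = [("A", "B", "C")] * 4 + [("B", "C", "A")] * 3 + [("C", "B", "A")] * 2
print(plurality_winner(ballots))  # A
print(borda_winner(ballots))      # B
print(condorcet_winner(ballots))  # B

# A three-voter cycle has no Condorcet winner (the Condorcet paradox)
cycle = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]
print(condorcet_winner(cycle))    # None
```

In this invented profile Plurality picks A while Borda and Condorcet pick B, which is the kind of disagreement the axiomatic literature warns about and which the authors report to be rare in the APA data.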

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The management of water resources in Ireland prior to the Water Framework Directive (WFD) has focussed on surface water and groundwater as separate entities. A critical element to the successful implementation of the WFD is to improve our understanding of the interaction between the two and the flow mechanisms by which groundwaters discharge to surface waters. An improved understanding of the contribution of groundwater to surface water is required for the classification of groundwater body status and the determination of groundwater quality thresholds. The results of the study will also have a wider application to many areas of the WFD.

A subcommittee of the WFD Groundwater Working Group (GWWG) has been formed to develop a methodology to estimate the groundwater contribution to Irish rivers. The group has selected a number of analytical techniques to quantify components of stream flow in an Irish context (Master Recession Curve, Unit Hydrograph, Flood Studies Report methodologies and hydrogeological analytical modelling). The components of stream flow that can be identified include deep groundwater, intermediate and overland. These analyses have been tested on seven pilot catchments that have a variety of hydrogeological settings and have been used to inform and constrain a mathematical model. The mathematical model used was the NAM (Nedbør-Afstrømnings-Model) rainfall-runoff model, which is a module of DHI's MIKE 11 modelling suite. The results from these pilot catchments have been used to develop a decision model based on catchment descriptors from GIS datasets for the selection of NAM parameters. The datasets used include the mapping of aquifers, vulnerability and subsoils, soils, the Digital Terrain Model, CORINE and lakes. The national coverage of the GIS datasets has allowed the extrapolation of the mathematical model to regional catchments across Ireland.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Public policy is expected to be both responsive to societal views and accountable to all citizens. As such, policy is informed, but not governed, by public opinion. Therefore, understanding the attitudes of the public is important, both to help shape and to evaluate policy priorities. In this way, surveys play a potentially important role in the policy making process.

The aim of this paper is to explore the role of survey research in policy making in Northern Ireland, with particular reference to community relations (better known internationally as good relations). In a region which is emerging from 40 years of conflict, community relations is a key policy area.

For more than 20 years, public attitudes to community relations have been recorded and monitored using two key surveys: the Northern Ireland Social Attitudes Survey (1989 to 1996) and the Northern Ireland Life and Times Survey (1998 to present). This paper will illustrate how these important time series datasets have been used to both inform and evaluate government policy in relation to community relations. By using four examples, we will highlight how these survey data have provided key government indicators of community relations, as well as how they have been used by other groups (such as NGOs) within policy consultation debates. Thus, the paper will provide a worked example of the integral and bi-directional relationship between attitude measurement and policy making.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Human action recognition is an important problem in computer vision with many applications. However, learning an accurate and discriminative representation of videos from the features extracted from them remains a challenging problem. In this paper, we propose a novel method, low-rank representation based action recognition, to recognize human actions. Given a dictionary, low-rank representation aims at finding the lowest-rank representation of all data, which can capture the global data structures. Owing to these characteristics, low-rank representation is robust to noise. Experimental results demonstrate the effectiveness of the proposed approach on several publicly available datasets.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Object tracking is an active research area due to its importance in human-computer interfaces, teleconferencing and video surveillance. However, reliable tracking of objects in the presence of occlusions and of pose and illumination changes is still a challenging topic. In this paper, we introduce a novel tracking approach that fuses two cues, namely colour and spatio-temporal motion energy, within a particle filter based framework. We compute a measure of coherent motion over two image frames, which reveals the spatio-temporal dynamics of the target. At the same time, the importance of both the colour and motion energy cues is determined in the reliability evaluation stage. This determination helps maintain the performance of the tracking system against abrupt appearance changes. Experimental results demonstrate that the proposed method outperforms other state-of-the-art techniques on the test datasets used.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Objective: to explore maternal energy balance, incorporating free living physical activity and sedentary behaviour, in uncomplicated pregnancies at risk of macrosomia.

Methods: a parallel-group cross-sectional analysis was conducted in healthy pregnant women predicted to deliver infants weighing ≥4000 g (study group) or <4000 g (control group). Women were recruited in a 1:1 ratio from antenatal clinics in Northern Ireland. Women wore a SenseWear® Body Media Pro3 physical activity armband and completed a food diary for four consecutive days in the third trimester. Physical activity was measured in Metabolic Equivalent of Tasks (METs) where 1 MET = 1 kcal per kilogram of body weight per hour. Analysis of covariance (ANCOVA) was employed using the General Linear Model to adjust for potential confounders.

Findings: of the 112 women recruited, 100 complete datasets were available for analysis. There was no significant difference in energy balance between the two groups. Intensity of free living physical activity (average METs) of women predicted to deliver macrosomic infants (n = 50) was significantly lower than that of women in the control group (n = 50) (1.3 (0.2) METs (mean, standard deviation) versus 1.2 (0.2) METs; difference in means 0.1 METs (95% confidence interval: 0.19, 0.01); p = 0.021). Women predicted to deliver macrosomic infants also spent significantly more time in sedentary behaviour (≤1 MET) than the control group (16.1 (2.8) hours versus 13.8 (4.3) hours; 2.0 hours (0.3, 3.7), p = 0.020).

Key conclusions and implications for practice: although there was no association between predicted fetal macrosomia and energy balance, those women predicted to deliver a macrosomic infant exhibited increased sedentary behaviour and reduced physical activity in the third trimester of pregnancy. Professionals caring for women during pregnancy have an important role in promoting and supporting more active lifestyles amongst women who are predicted to deliver a macrosomic infant given the known associated risks.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Pollen grains are microscopic so their identification and quantification has, for decades, depended upon human observers using light microscopes: a labour-intensive approach. Modern improvements in computing and imaging hardware and software now bring automation of pollen analyses within reach. In this paper, we provide the first review in over 15 yr of progress towards automation of the part of palynology concerned with counting and classifying pollen, bringing together literature published from a wide spectrum of sources. We consider which attempts offer the most potential for an automated palynology system for universal application across all fields of research concerned with pollen classification and counting. We discuss what is required to make the datasets of these automated systems as acceptable as those produced by human palynologists, and present suggestions for how automation will generate novel approaches to counting and classifying pollen that have hitherto been unthinkable.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This research investigates the relationship between elevated trace elements in soils, stream sediments and stream water and the prevalence of Chronic Kidney Disease (CKD). The study combines datasets from the UK Renal Registry Report (UKRR) on patients with renal diseases requiring treatment, including Renal Replacement Therapy (RRT); the soil geochemical dataset for Northern Ireland provided by the Tellus Survey, Geological Survey of Northern Ireland (GSNI); and the bioaccessibility of Potentially Toxic Elements (PTEs) in soil samples, obtained using the Unified BARGE Method (UBM). The relationship between these factors derives from the UKRR report, which highlights incidence rates of renal impaired patients showing regional variation with cases of unknown aetiology. Studies suggest that a potential cause of the large variation and uncertain aetiology is associated with underlying environmental factors such as the oral bioaccessibility of trace elements in the gastrointestinal tract.

As previous research indicates that long term exposure is related to environmental factors, Northern Ireland is ideally placed for this research as people traditionally live in the same location for long periods of time. Exploratory data analysis and multivariate analyses are used to examine the soil, stream sediment and stream water geochemistry data for a range of key elements, including arsenic, lead, cadmium and mercury, identified from a review of previous renal disease literature. The spatial prevalence of patients with long term CKD is analysed on an area basis. Further work includes cluster analysis to detect areas of low or high incidence of CKD that are significantly correlated in space, and Geographically Weighted Regression (GWR) and Poisson kriging to examine locally varying relationships between elevated concentrations of PTEs and the prevalence of CKD.