14 results for data availability

at QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevance:

70.00%

Publisher:

Abstract:

Background. The assembly of the tree of life has seen significant progress in recent years, but algae and protists have been largely overlooked in this effort. Many groups of algae and protists have ancient roots and it is unclear how much data will be required to resolve their phylogenetic relationships for incorporation in the tree of life. The red algae, a group of primary photosynthetic eukaryotes more than a billion years old, provide the earliest fossil evidence for eukaryotic multicellularity and sexual reproduction. Despite this evolutionary significance, their phylogenetic relationships are understudied. This study aims to infer a comprehensive red algal tree of life at the family level from a supermatrix containing data mined from GenBank. We aim to locate remaining regions of low support in the topology, evaluate their causes and estimate the amount of data required to resolve them. Results. Phylogenetic analysis of a supermatrix of 14 loci and 98 red algal families yielded the most complete red algal tree of life to date. Visualization of statistical support showed the presence of five poorly supported regions. Causes for low support were identified with statistics about the age of the region, data availability and node density, showing that poor support has different origins in different parts of the tree. Parametric simulation experiments yielded optimistic estimates of how much data will be needed to resolve the poorly supported regions (ca. 10^3 to ca. 10^4 nucleotides for the different regions). Nonparametric simulations gave a markedly more pessimistic picture, some regions requiring more than 2.8 × 10^5 nucleotides or not achieving the desired level of support at all. The discrepancies between parametric and nonparametric simulations are discussed in light of our dataset and known attributes of both approaches. Conclusions. Our study takes the red algae one step closer to meaningful inclusion in the tree of life.
In addition to the recovery of stable relationships, the recognition of five regions in need of further study is a significant outcome of this work. Based on our analyses of current availability and future requirements of data, we make clear recommendations for forthcoming research.
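The supermatrix approach described above (concatenating loci mined from GenBank, with gap characters where a family lacks data for a locus) can be sketched in a few lines of Python. The locus names and sequences below are invented for illustration:

```python
# Hypothetical mini-example: two loci, not every family sequenced for both.
loci = {
    "rbcL": {"Bangiaceae": "ATGGCA", "Gelidiaceae": "ATGGCT"},
    "18S":  {"Bangiaceae": "GGCCTT"},
}

def build_supermatrix(loci):
    """Concatenate per-locus alignments, padding missing taxa with gaps."""
    taxa = sorted({t for aln in loci.values() for t in aln})
    matrix = {t: "" for t in taxa}
    for name, aln in loci.items():
        length = len(next(iter(aln.values())))  # aligned length of this locus
        for t in taxa:
            matrix[t] += aln.get(t, "-" * length)  # gap-fill absent taxa
    return matrix

supermatrix = build_supermatrix(loci)
# Gelidiaceae lacks 18S data, so its row ends in gap characters.
```

Uneven data availability across loci, which this gap-filling makes explicit, is exactly the kind of property the study relates to poorly supported regions.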

Relevance:

60.00%

Publisher:

Abstract:

This paper aims to contribute to the ongoing debate on the use of resource accounting tools in regional policy making. The Northern Limits project applied Material Flow Analysis (MFA) and Ecological Footprinting (EF) to regional policy making in Northern Ireland over a number of years. The early phase of the research informed the region's first sustainable development strategy, published in 2006, with key targets relating to the Ecological Footprint and to improving the resource efficiency of the economy. Phase II identified the next steps required to address data availability and quality, and the use of MFA and EF in providing a measurement and monitoring framework for the strategy and in the development of the strategy implementation plan. The use of MFA and Ecological Footprinting in sustainable regional policy making, and the monitoring of its implementation, is an ongoing process which has raised a number of research issues. These issues can inform the continuing application and development of these and other resource accounting tools within Northern Ireland, provide insights for their use in other regions, and help set out the priorities for research to support this important policy area.

Relevance:

60.00%

Publisher:

Abstract:

This paper describes the methodology, results and limitations of the 2013 International Diabetes Federation (IDF) Atlas (6th edition) estimates of the worldwide numbers of prevalent cases of type 1 diabetes in children (<15 years). The majority of relevant information in the published literature is in the form of incidence rates derived from registers of newly diagnosed cases. Studies were graded on quality criteria and, if no information was available in the published literature, extrapolation was used to assign a country the rate from an adjacent country with similar characteristics. Prevalence rates were then derived from these incidence rates and applied to United Nations 2012 Revision population estimates for 2013 for each country to obtain estimates of the number of prevalent cases. Data availability was highest for the countries in Europe (76%) and lowest for the countries in sub-Saharan Africa (8%). The prevalence estimates indicate that there are almost 500,000 children aged under 15 years with type 1 diabetes worldwide, the largest numbers being in Europe (129,000) and North America (108,700). Countries with the highest estimated numbers of new cases annually were the United States (13,000), India (10,900) and Brazil (5000). Compared with the prevalence estimates made in previous editions of the IDF Diabetes Atlas, the numbers have increased in most of the IDF Regions, often reflecting the incidence rate increases that have been well-documented in many countries. Monogenic diabetes is increasingly being recognised among those with clinical features of type 1 or type 2 diabetes as genetic studies become available, but population-based data on incidence and prevalence show wide variation due to lack of standardisation in the studies. Similarly, studies on type 2 diabetes in childhood suggest increased incidence and prevalence in many countries, especially in Indigenous peoples and ethnic minorities, but detailed population-based studies remain limited.
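The core calculation above, applying a prevalence rate derived from incidence to a population estimate, can be sketched as follows. This is a simplified illustration rather than the IDF Atlas methodology: it assumes constant incidence across ages 0-14, which gives a mean time lived with the disease of about 7.5 years among prevalent cases under 15, and the input numbers are invented.

```python
def prevalent_cases(incidence_per_100k, population_under_15,
                    mean_duration_years=7.5):
    """Crude prevalence estimate: incidence rate multiplied by the mean years
    lived with the disease while still under 15 (7.5 y if incidence is
    uniform over ages 0-14), applied to the under-15 population."""
    prevalence_per_100k = incidence_per_100k * mean_duration_years
    return prevalence_per_100k / 100_000 * population_under_15

# Illustrative (made-up) numbers: incidence 20/100,000/yr, 1,000,000 children.
cases = prevalent_cases(20, 1_000_000)   # roughly 1,500 prevalent cases
```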

Relevance:

30.00%

Publisher:

Abstract:

Recent experimental neutron diffraction data and ab initio molecular dynamics simulation of the ionic liquid dimethylimidazolium chloride ([dmim]Cl) have provided a structural description of the system at the molecular level. However, partial radial distribution functions calculated from the latter, when compared to previous classical simulation results, highlight some limitations in the structural description offered by force field-based simulations. With the availability of ab initio data it is possible to improve the classical description of [dmim]Cl by using the force matching approach, and the strategy for fitting complex force fields in their original functional form is discussed. A self-consistent optimization method for the generation of classical potentials of general functional form is presented and applied, and a force field that better reproduces the observed first-principles forces is obtained. When used in simulation, it predicts structural data which more faithfully reproduce those observed in the ab initio studies. Some possible refinements to the technique, its application, and the general suitability of common potential energy functions used within many ionic liquid force fields are discussed.
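The essence of force matching, choosing classical force field parameters so that the model forces best reproduce a set of reference ab initio forces, can be illustrated with a toy one-parameter model. The pair force F(r) = A/r^2 and the data below are invented for illustration; the paper fits far more complex functional forms.

```python
# Toy force-matching sketch (not the paper's actual force field): fit the
# strength A of a model pair force F(r) = A / r**2 so that it best matches
# reference ("ab initio") forces in a least-squares sense. Because the model
# is linear in A, the optimum has a closed form.

def force_match(distances, ref_forces):
    """Least-squares fit of A in F(r) = A/r^2 to reference forces."""
    basis = [1.0 / r**2 for r in distances]
    return (sum(b * f for b, f in zip(basis, ref_forces))
            / sum(b * b for b in basis))

# Synthetic reference data generated from A = 2.0, so the fit recovers it.
rs = [1.0, 1.5, 2.0, 3.0]
fs = [2.0 / r**2 for r in rs]
A = force_match(rs, fs)
```

With noisy reference forces the same least-squares machinery returns the best compromise parameter, which is the situation the self-consistent optimization in the paper addresses at scale.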

Relevance:

30.00%

Publisher:

Abstract:

This paper examines the stability of the benefit transfer function across 42 recreational forests in the British Isles. A working definition of reliable function transfer is put forward, and a suitable statistical test is provided. A novel split-sample method is used to test the sensitivity of the models' log-likelihood values to the removal of contingent valuation (CV) responses collected at individual forest sites. We find that a stable function improves our measure of transfer reliability, but not by much. We conclude that, in empirical studies on transferability, considerations of function stability are secondary to the availability and quality of site attribute data. Modellers can study the advantages of transfer function stability vis-à-vis the value of additional information on recreation site attributes. (c) 2008 Elsevier GmbH. All rights reserved.
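The leave-one-site-out mechanics behind the split-sample method can be sketched as follows. This is only a schematic illustration: the "model" here is a pooled Bernoulli response probability with invented counts, whereas the paper refits a full benefit transfer function and applies a formal statistical test to the log-likelihood changes.

```python
import math

def log_lik(yes, total):
    """Bernoulli log-likelihood at the pooled maximum-likelihood estimate."""
    p = yes / total
    return yes * math.log(p) + (total - yes) * math.log(1 - p)

# Invented CV response counts per site: (number of "yes" responses, total).
sites = {"ForestA": (60, 100), "ForestB": (45, 100), "ForestC": (75, 100)}

full_yes = sum(y for y, n in sites.values())
full_n = sum(n for _, n in sites.values())
full_ll = log_lik(full_yes, full_n)

# Refit after dropping each site in turn and record the change in fit.
sensitivity = {}
for site, (y, n) in sites.items():
    reduced_ll = log_lik(full_yes - y, full_n - n)
    sensitivity[site] = full_ll - reduced_ll
```

A site whose removal shifts the log-likelihood far more than the others is the kind of instability the paper's test is designed to detect.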

Relevance:

30.00%

Publisher:

Abstract:

Flutter prediction as currently practiced is usually deterministic, with a single structural model used to represent an aircraft. By using interval analysis to take into account structural variability, recent work has demonstrated that small changes in the structure can lead to very large changes in the altitude at which flutter occurs (Marques, Badcock, et al., J. Aircraft, 2010). In this follow-up work we examine the same phenomenon using probabilistic collocation (PC), an uncertainty quantification technique which can efficiently propagate multivariate stochastic input through a simulation code, in this case an eigenvalue-based fluid-structure stability code. The resulting analysis predicts the consequences of an uncertain structure on the incidence of flutter in probabilistic terms, information that could be useful in planning flight tests and assessing the risk of structural failure. The uncertainty in flutter altitude is confirmed to be substantial. Assuming that the structural uncertainty represents an epistemic uncertainty regarding the structure, it may be reduced with the availability of additional information, for example aeroelastic response data from a flight test. Such data are used to update the structural uncertainty using Bayes' theorem. The consequent flutter uncertainty is significantly reduced across the entire Mach number range.
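The Bayesian updating step described above can be illustrated with the simplest conjugate case: a Gaussian prior on a structural parameter updated by one noisy Gaussian measurement. The numbers are illustrative, not taken from the paper.

```python
def gaussian_update(prior_mean, prior_var, obs, obs_var):
    """Posterior of a Gaussian prior after one Gaussian observation."""
    w = prior_var / (prior_var + obs_var)          # Kalman-style gain
    post_mean = prior_mean + w * (obs - prior_mean)
    post_var = (1 - w) * prior_var                 # always < prior_var
    return post_mean, post_var

# Epistemic prior on a (hypothetical) structural stiffness ratio, then one
# flight-test observation with known measurement noise.
mean, var = gaussian_update(prior_mean=1.0, prior_var=0.04,
                            obs=1.1, obs_var=0.01)
# The posterior variance is a fifth of the prior's: the flight-test data
# has reduced the epistemic uncertainty, as in the abstract's conclusion.
```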

Relevance:

30.00%

Publisher:

Abstract:

Recent years have witnessed rapidly increasing interest in the topic of incremental learning. Unlike conventional machine learning situations, the data flow targeted by incremental learning becomes available continuously over time. Accordingly, it is desirable to abandon the traditional assumption that representative training data are available during the training period for developing decision boundaries. Under scenarios of continuous data flow, the challenge is how to transform the vast amount of raw streaming data into information and knowledge representation, and to accumulate experience over time to support the future decision-making process. In this paper, we propose a general adaptive incremental learning framework named ADAIN that is capable of learning from continuous raw data, accumulating experience over time, and using such knowledge to improve future learning and prediction performance. Detailed system-level architecture and design strategies are presented in this paper. Simulation results over several real-world data sets are used to validate the effectiveness of this method.
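The contrast with conventional batch training can be made concrete with a minimal online learner; this is a generic perceptron-style sketch, not the ADAIN framework itself.

```python
# Generic incremental-learning sketch: the model never sees the full training
# set at once, but updates its parameters as each example of the stream
# arrives, accumulating "experience" in its weights.

def train_incrementally(stream, lr=0.1):
    """Online perceptron-style updates over a stream of (features, label)."""
    w = [0.0, 0.0]
    b = 0.0
    for x, y in stream:                      # y in {-1, +1}
        pred = 1 if w[0]*x[0] + w[1]*x[1] + b > 0 else -1
        if pred != y:                        # update only on mistakes
            w = [w[0] + lr*y*x[0], w[1] + lr*y*x[1]]
            b += lr * y
    return w, b

# Linearly separable toy stream: label is positive iff x0 > x1.
stream = [((2, 0), 1), ((0, 2), -1), ((3, 1), 1), ((1, 3), -1)] * 5
w, b = train_incrementally(stream)
```

The decision boundary emerges from the stream itself, without ever holding a representative training set in memory, which is the assumption the abstract says incremental learning must abandon.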

Relevance:

30.00%

Publisher:

Abstract:

EUROCHIP (European Cancer Health Indicators Project) focuses on understanding inequalities in the cancer burden, care and survival through "stage at diagnosis," "cancer treatment delay" and "compliance with cancer guidelines" as the most important indicators. Our study aims at providing insight into whether cancer registries collect well-defined variables to determine these indicators in a comparative way. Eighty-six general European population-based cancer registries (PBCR) from 32 countries responded to the questionnaire, which was developed by EUROCHIP in collaboration with the ENCR (European Network of Cancer Registries) and EUROCOURSE. Only 15% of all the PBCRs in the EU had all three indicators available. The indicator "stage at diagnosis" was gathered for at least one cancer site by 81% (using TNM in 39%). Variables for the indicator "cancer treatment delay" were collected by 37%. Given the availability of type of treatment (30%), surgery date (36%), starting date of radiotherapy (26%) and starting date of chemotherapy (23%), only 15% of the PBCRs were able to gather the indicator "compliance to guidelines". Lack of data source access and qualified staff were the major reasons for not collecting all the variables. In conclusion, based on self-reporting, a few of the participating PBCRs had data available which could be used for clinical audits, evaluation of cancer care projects, survival studies and monitoring of national cancer control strategies. Extra efforts should be made to improve this very efficient tool to compare the cancer burden and the effects of national cancer plans across Europe and to learn from each other. © 2012 UICC.

Relevance:

30.00%

Publisher:

Abstract:

Objective: Several surveillance definitions of influenza-like illness (ILI) have been proposed, based on the presence of symptoms. Symptom data can be obtained from patients, medical records, or both. Past research has found that agreement between health record data and self-report varies depending on the specific symptom. Therefore, we aimed to explore the implications of using data on influenza symptoms extracted from medical records, similar data collected prospectively from outpatients, and the combined data from both sources as predictors of laboratory-confirmed influenza. Methods: Using data from the Hutterite Influenza Prevention Study, we calculated: 1) the sensitivity, specificity and predictive values of individual symptoms within surveillance definitions; 2) how frequently surveillance definitions correlated with laboratory-confirmed influenza; and 3) the predictive value of surveillance definitions. Results: Of the 176 participants with both self-reports and medical records, 142 (81%) were tested for influenza and 37 (26%) were PCR positive for influenza. Fever (alone) and fever combined with cough and/or sore throat were highly correlated with being PCR positive for influenza for all data sources. ILI surveillance definitions based on symptom data from medical records only, or from both medical records and self-report, were better predictors of laboratory-confirmed influenza, with higher odds ratios and positive predictive values. Discussion: The choice of data source to determine ILI will depend on the patient population, the outcome of interest, the availability of the data source, and the use for clinical decision making, research, or surveillance. © Canadian Public Health Association, 2012.
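The diagnostic statistics reported in the study (sensitivity, specificity and predictive values) are computed from a 2x2 table of symptom presence against PCR-confirmed influenza. A sketch with invented counts, not the study's data:

```python
def diagnostics(tp, fp, fn, tn):
    """Standard 2x2-table statistics for a symptom vs. PCR-confirmed flu."""
    return {
        "sensitivity": tp / (tp + fn),   # P(symptom present | flu positive)
        "specificity": tn / (tn + fp),   # P(symptom absent  | flu negative)
        "ppv":         tp / (tp + fp),   # P(flu positive    | symptom present)
        "npv":         tn / (tn + fn),   # P(flu negative    | symptom absent)
    }

# Illustrative counts only: 30 true positives, 20 false positives,
# 7 false negatives, 85 true negatives.
stats = diagnostics(tp=30, fp=20, fn=7, tn=85)
```

Note that the predictive values, unlike sensitivity and specificity, depend on how common influenza is in the tested population, which is one reason the abstract says the best data source depends on the patient population.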

Relevance:

30.00%

Publisher:

Abstract:

This research aims to use the multivariate geochemical dataset generated by the Tellus project to investigate the appropriate use of transformation methods, so as to maintain the integrity of geochemical data and the inherently constrained behaviour in multivariate relationships. The widely used normal score transform is compared with a stepwise conditional transform technique. The Tellus Project, managed by GSNI and funded by the Department of Enterprise Trade and Development and the EU's Building Sustainable Prosperity Fund, involves the most comprehensive geological mapping project ever undertaken in Northern Ireland. Previous study has demonstrated spatial variability in the Tellus data, but geostatistical analysis and interpretation of the datasets require a methodology that reproduces the inherently complex multivariate relations. Previous investigation of the Tellus geochemical data has included the use of Gaussian-based techniques; however, earth science variables are rarely Gaussian, hence transformation of the data is integral to the approach. In particular, the stepwise conditional transform is investigated and developed for the geochemical datasets obtained as part of the Tellus project. The transform is applied to four variables in a bivariate nested fashion due to the limited availability of data. Simulation of these transformed variables is then carried out, along with a corresponding back transformation to original units. Results show that the stepwise transform is successful in reproducing both the univariate statistics and the complex bivariate relations exhibited by the data.
Greater fidelity to multivariate relationships will improve uncertainty models, which are required for consequent geological, environmental and economic inferences.
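The normal score transform mentioned above, the baseline against which the stepwise conditional transform is compared, can be sketched as a rank-to-Gaussian-quantile mapping. The stepwise conditional refinement, which applies the transform per class of a previously transformed variable, is omitted here.

```python
from statistics import NormalDist

def normal_score(values):
    """Map each value to the standard-normal quantile of its rank, so any
    (non-Gaussian) sample becomes approximately standard normal."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n                 # plotting position in (0, 1)
        scores[i] = NormalDist().inv_cdf(p)
    return scores

# A strongly skewed sample, as is typical of geochemical concentrations,
# maps to symmetric, zero-centred scores.
z = normal_score([1, 2, 4, 8, 100, 1000])
```

Gaussian-based geostatistical simulation then operates on the scores, with a back transformation to original units afterwards, as the abstract describes.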

Relevance:

30.00%

Publisher:

Abstract:

Automatically determining and assigning shared and meaningful text labels to data extracted from an e-Commerce web page is a challenging problem. An e-Commerce web page can display a list of data records, each of which can contain a combination of data items (e.g. product name and price) and explicit labels, which describe some of these data items. Recent advances in extraction techniques have made it much easier to precisely extract individual data items and labels from a web page; however, two problems remain open: 1. assigning an explicit label to a data item, and 2. determining labels for the remaining data items. Furthermore, improvements in the availability and coverage of vocabularies, especially in the context of e-Commerce web sites, mean that we now have access to a bank of relevant, meaningful and shared labels which can be assigned to extracted data items. However, there is a need for a technique which takes as input a set of extracted data items and automatically assigns to them the most relevant and meaningful labels from a shared vocabulary. We observe that the Information Extraction (IE) community has developed a great number of techniques which solve problems similar to our own. In this work-in-progress paper we propose to evaluate, both theoretically and experimentally, different IE techniques to ascertain which is most suitable for solving this problem.

Relevance:

30.00%

Publisher:

Abstract:

The last decade has witnessed unprecedented growth in the availability of data with spatio-temporal characteristics. Given the scale and richness of such data, finding spatio-temporal patterns that demonstrate significantly different behavior from their neighbors could be of interest for various application scenarios, such as weather modeling, analyzing the spread of disease outbreaks, and monitoring traffic congestion. In this paper, we propose an automated approach for exploring and discovering such anomalous patterns irrespective of the underlying domain from which the data are recovered. Our approach differs significantly from traditional methods of spatial outlier detection and employs two phases: i) discovering homogeneous regions, and ii) evaluating these regions as anomalies based on their statistical difference from a generalized neighborhood. We evaluate the quality of our approach and distinguish it from existing techniques via an extensive experimental evaluation.
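The second phase, scoring a homogeneous region by its statistical difference from a generalized neighborhood, can be sketched with a simple standardized-difference score. The data and the 3-standard-deviation threshold are illustrative, not the paper's actual method.

```python
from statistics import mean, stdev

def region_anomaly_score(region_values, neighborhood_values):
    """Standardized difference between a region's mean and its neighborhood."""
    mu, sigma = mean(neighborhood_values), stdev(neighborhood_values)
    return (mean(region_values) - mu) / sigma

# Invented example: a homogeneous region whose values sit far above its
# generalized neighborhood, e.g. a traffic-congestion hotspot.
neigh = [10, 11, 9, 10, 12, 10, 9, 11]
hot_region = [25, 27, 26]
score = region_anomaly_score(hot_region, neigh)
is_anomaly = abs(score) > 3.0             # simple threshold rule
```

Scoring whole regions rather than single points is what distinguishes this approach from classical spatial outlier detection, as the abstract notes.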

Relevance:

30.00%

Publisher:

Abstract:

In dynamic spectrum access networks, cognitive radio terminals monitor their spectral environment in order to detect and opportunistically access unoccupied frequency channels. The overall performance of such networks depends on the spectrum occupancy or availability patterns. Accurate knowledge of channel availability enables optimum performance of such networks in terms of spectrum and energy efficiency. This work proposes a novel probabilistic channel availability model that can describe the channel availability in different polarizations for mobile cognitive radio terminals that are likely to change their orientation during operation. A Gaussian approximation is used to model the empirical occupancy data that were obtained through a measurement campaign in the cellular frequency bands within a realistic operational scenario.

Relevance:

30.00%

Publisher:

Abstract:

Cognitive radio has been proposed as a means of improving spectrum utilisation and increasing the spectrum efficiency of wireless systems. This can be achieved by allowing cognitive radio terminals to monitor their spectral environment and opportunistically access unoccupied frequency channels. Due to the opportunistic nature of cognitive radio, the overall performance of such networks depends on the spectrum occupancy or availability patterns. Appropriate knowledge of channel availability can optimise sensing performance in terms of spectrum and energy efficiency. This work proposes a statistical framework for channel availability in the polarization domain. A Gaussian approximation is used to model real-world occupancy data obtained through a measurement campaign in the cellular frequency bands within a realistic scenario.
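The Gaussian modelling step described in these last two abstracts can be sketched by fitting a normal distribution to measured duty-cycle (occupancy) samples and reading off the probability that occupancy stays below a threshold, i.e. that the channel is available to a cognitive radio. The sample values and the 30% threshold are invented for illustration.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical measured duty cycles (fraction of time the channel is occupied)
# standing in for a measurement campaign's occupancy data.
duty_cycles = [0.10, 0.15, 0.12, 0.20, 0.08, 0.14, 0.11, 0.18]

# Fit a Gaussian to the empirical occupancy, then estimate availability as
# the probability that occupancy stays below a chosen threshold.
model = NormalDist(mean(duty_cycles), stdev(duty_cycles))
p_available = model.cdf(0.30)   # P(occupancy < 30%)
```

In the papers' setting such a model would be fitted separately per polarization, letting a mobile terminal that changes orientation pick the polarization with the highest predicted availability.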