68 results for random forest data analysis


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this paper is to equip readers with an understanding of the principles of qualitative data analysis and offer a practical example of how analysis might be undertaken in an interview-based study.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Irish and UK governments, along with other countries, have made a commitment to limit the concentrations of greenhouse gases in the atmosphere by reducing emissions from the burning of fossil fuels. This can be achieved (in part) through increasing the sequestration of CO2 from the atmosphere including monitoring the amount stored in vegetation and soils. A large proportion of soil carbon is held within peat due to the relatively high carbon density of peat and organic-rich soils. This is particularly important for a country such as Ireland, where some 16% of the land surface is covered by peat. For Northern Ireland, it has been estimated that the total amount of carbon stored in vegetation is 4.4Mt compared to 386Mt stored within peat and soils. As a result it has become increasingly important to measure and monitor changes in stores of carbon in soils. The conservation and restoration of peat covered areas, although ongoing for many years, has become increasingly important. This is summed up in current EU policy outlined by the European Commission (2012) which seeks to assess the relative contributions of the different inputs and outputs of organic carbon and organic matter to and from soil. Results are presented from the EU-funded Tellus Border Soil Carbon Project (2011 to 2013) which aimed to improve current estimates of carbon in soil and peat across Northern Ireland and the bordering counties of the Republic of Ireland.
Historical reports and previous surveys provide baseline data. To monitor change in peat depth and soil organic carbon, these historical data are integrated with more recently acquired airborne geophysical (radiometric) data and ground-based geochemical data generated by two surveys, the Tellus Project (2004-2007: covering Northern Ireland) and the EU-funded Tellus Border project (2011-2013) covering the six bordering counties of the Republic of Ireland, Donegal, Sligo, Leitrim, Cavan, Monaghan and Louth. The concept being applied is that saturated organic-rich soil and peat attenuate gamma-radiation from underlying soils and rocks. This research uses the degree of spatial correlation (coregionalization) between peat depth, soil organic carbon (SOC) and the attenuation of the radiometric signal to update a limited sampling regime of ground-based measurements with remotely acquired data. To comply with the compositional nature of the SOC data (perturbations of loss on ignition [LOI] data), a compositional data analysis approach is investigated. Contemporaneous ground-based measurements allow corroboration for the updated mapped outputs. This provides a methodology that can be used to improve estimates of soil carbon with minimal impact to sensitive habitats (like peat bogs), but with maximum output of data and knowledge.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

1) Executive Summary
Legislation (Autism Act NI, 2011), a cross-departmental strategy (Autism Strategy 2013-2020) and a first action plan (2013-2016) have been developed in Northern Ireland in order to support individuals and families affected by Autism Spectrum Disorder (ASD) without a prior thorough baseline assessment of need. At the same time, there are large existing data sets about the population in NI that had never been subjected to a secondary data analysis with regards to data on ASD. This report covers the first comprehensive secondary data analysis and thereby aims to inform future policy and practice.
Following a search of all existing, large-scale, regional or national data sets that were relevant to the lives of individuals and families affected by Autism Spectrum Disorder (ASD) in Northern Ireland, extensive secondary data analyses were carried out. The focus of these secondary data analyses was to distill any ASD related data from larger generic data sets. The findings are reported for each data set and follow a lifespan perspective, i.e., data related to children is reported first before data related to adults.
Key findings:
Autism Prevalence:
Of children born in 2000 in the UK,
• 0.9% (1:109) were reported to have ASD when they were 5 years old in 2005;
• 1.8% (1:55) were reported to have ASD when they were 7 years old in 2007;
• 3.5% (1:29) were reported to have ASD when they were 11 years old in 2011.
In mainstream schools in Northern Ireland
• 1.2% of the children were reported to have ASD in 2006/07;
• 1.8% of the children were reported to have ASD in 2012/13.

Economic Deprivation:
• Families of children with autism (CWA) were 9%-18% worse off per week than families of children not on the autism spectrum (COA).
• Between 2006 and 2013, deprivation of CWA compared to COA, as measured by eligibility for free school meals, nearly doubled (from nearly 20% to 37%).
• In 2006, CWA and COA experienced similar levels of deprivation (approx. 20%); by 2013, a considerable deprivation gap had developed, with CWA experiencing 6% more deprivation than COA.
• Nearly 1/3 of primary school CWA lived in the most deprived areas in Northern Ireland.
• Nearly ½ of children with Asperger’s Syndrome who attended special school lived in the most deprived areas.

Unemployment:
• Mothers of CWA were 6% less likely to be employed than mothers of COA.
• Mothers of CWA earned 35%-56% less than mothers of COA.
• CWA were 9% less likely to live in two income families than COA.

Health:
• Pre-diagnosis, CWA were more likely than COA to have physical health problems, including difficulties with walking on level ground, speech and language, hearing, and eyesight, as well as asthma.
• At 3 years of age, CWA experienced poorer emotional and social health than COA; this difference increased significantly by the time they were 7 years of age.
• Mothers of young CWA had lower levels of life satisfaction and poorer mental health than mothers of young COA.

Education:
• In mainstream education, children with ASD aged 11-16 years reported less satisfaction with their social relationships than COA.
• Younger children with ASD (aged 5 and 7 years) were less likely to enjoy school, were bullied more, and were more reluctant to attend school than COA.
• CWA attended school 2-3 weeks less than COA.
• Children with Asperger’s Syndrome in special schools missed the equivalent of 8-13 school days more than children with Asperger’s Syndrome in mainstream schools.
• Children with ASD attending mainstream schooling were less likely to gain 5+ GCSEs A*-C or subsequently attend university.

Further and Higher Education:
• Enrolment rates for students with ASD have risen in Further Education (FE), from 0% to 0.7%.
• Enrolment rates for students with ASD have risen in Higher Education (HE), from 0.28% to 0.45%.
• Students with ASD chose to study different subjects than students without ASD, although other factors, e.g., gender, age etc. may have played a part in subject selection.
• Students with ASD from NI were more likely than students without ASD to choose Northern Irish HE Institutions rather than study outside NI.

Participation in adult life and employment:
• A small number of adults with ASD (n=99) have benefitted from DES employment provision over the past 12 years.
• It is unknown how many adults with ASD have received employment support elsewhere (e.g. Steps to Work).

Awareness and Attitudes in the General Population:
• In both the 2003 and 2012 NI Life and Times Surveys (NILTS), the NI public reported positive attitudes towards the inclusion of children with ASD in mainstream education (see also BASE Project Vol. 2).

Gap Analysis Recommendations:
This was the first comprehensive secondary analysis with regards to ASD of existing large-scale data sets in Northern Ireland. Data gaps were identified and further replications would benefit from the following data inclusion:
• ASD should be recorded routinely in the following datasets:
o Census;
o Northern Ireland Survey of Activity Limitation (NISALD);
o Training for Success/Steps to work; Steps to Success;
o Travel survey;
o Hate crime; and
o Labour Force Survey.
• Data should be collected on the destinations/qualifications of special school leavers.
• NILT Survey autism module should be repeated in 5 years' time (2017) (see full report of 1st NILT Survey autism module 2012 in BASE Project Report Volume 2).
• General public attitudes and awareness should be assessed for children and young people, using the Young Life and Times Survey (YLT) and the Kids Life and Times Survey (KLT); (this work is underway, Dillenburger, McKerr, Schubolz, & Lloyd, 2014-2015).

Relevance:

100.00% 100.00%

Publisher:

Abstract:

TAP pulse responses are normally analysed using moments, which are integrals of the full TAP pulse response. However, in some cases the entire pulse response may not be recorded due to technical reasons, thereby compromising any data analysis due to moments generated from incomplete pulse responses. The current work discloses the development of a function which mathematically expands the tail of a TAP pulse response, so that the TAP data analysis can be accurately conducted. This newly developed analysis method has been applied to the oxidative dehydrogenation of ethane over Co–Cr–Sn–WOx/α-Al2O3 and Co–Cr–Sn–WOx/α-Al2O3 catalysts as a case study.
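
As a rough illustration of why an incomplete record compromises a moment-based analysis, the sketch below computes moments of a sampled pulse response with the trapezoidal rule, once on a full record and once on a truncated one. The synthetic pulse F(t) = t·exp(-t), the sampling grid, the cut-off time and the function name `moment` are all assumptions made for this example. It does not reproduce the tail-expansion function developed in the paper.

```python
import math


def moment(times, response, order):
    """Approximate the integral of t**order * F(t) dt with the trapezoidal rule."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        left = (times[i - 1] ** order) * response[i - 1]
        right = (times[i] ** order) * response[i]
        total += 0.5 * (left + right) * dt
    return total


# Synthetic pulse (assumption): F(t) = t * exp(-t), sampled every 0.01 time units.
times = [i * 0.01 for i in range(2001)]  # t runs from 0 to 20
response = [t * math.exp(-t) for t in times]

# Compare the full record with a record that stops at about t = 3.
cut = 301
m0_full = moment(times, response, 0)
m0_cut = moment(times[:cut], response[:cut], 0)
m1_full = moment(times, response, 1)
m1_cut = moment(times[:cut], response[:cut], 1)

print("zeroth moment, full vs truncated:", m0_full, m0_cut)
print("first moment,  full vs truncated:", m1_full, m1_cut)
```

For this synthetic pulse the exact zeroth and first moments are 1 and 2, so the shortfall caused by stopping the record early shows up in both, and more strongly in the higher moment.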

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Quantile normalization (QN) is a technique for microarray data processing and is the default normalization method in the Robust Multi-array Average (RMA) procedure, which was primarily designed for analysing gene expression data from Affymetrix arrays. Given the abundance of Affymetrix microarrays and the popularity of the RMA method, it is crucially important that the normalization procedure is applied appropriately. In this study we carried out simulation experiments and also analysed real microarray data to investigate the suitability of RMA when it is applied to datasets with different groups of biological samples. From our experiments, we showed that RMA with QN does not preserve the biological signal included in each group, but rather mixes the signals between the groups. We also showed that the Median Polish method in the summarization step of RMA has a similar mixing effect. RMA is one of the most widely used methods in microarray data processing and has been applied to a vast volume of data in biomedical research. The problematic behaviour of this method suggests that previous studies employing RMA could have been misadvised or adversely affected. Therefore we think it is crucially important that the research community recognizes the issue and starts to address it. The two core elements of the RMA method, quantile normalization and Median Polish, both have the undesirable effect of mixing biological signals between different sample groups, which can be detrimental to drawing valid biological conclusions and to any subsequent analyses. Based on the evidence presented here and that in the literature, we recommend exercising caution when using RMA as a method of processing microarray gene expression data, particularly in situations where there are likely to be unknown subgroups of samples.
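
To make the mechanism concrete, here is a minimal sketch of plain quantile normalization on toy data. The two four-probe "arrays", the function name `quantile_normalize` and the simple tie handling are assumptions for illustration only. This is not the authors' simulation and not the full RMA pipeline (no background correction, no Median Polish).

```python
def quantile_normalize(samples):
    """Force every sample onto the same distribution.

    samples: list of equal-length lists, one list of intensities per array.
    Ties are broken by position, which is a simplification.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Mean intensity at each rank, taken across all arrays.
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples) for rank in range(n)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy data (assumption): array_b comes from a group whose signal is globally higher.
array_a = [2.0, 1.0, 4.0, 3.0]
array_b = [12.0, 11.0, 14.0, 13.0]

normalized = quantile_normalize([array_a, array_b])
print(normalized[0])
print(normalized[1])
```

After normalization both arrays carry identical values, so a group-level difference that was present in the raw data is no longer visible. Whether that difference was technical or biological is something the procedure cannot know, which is the concern the abstract raises.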

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Statistics are regularly used to make some form of comparison between trace evidence or deploy the exclusionary principle (Morgan and Bull, 2007) in forensic investigations. Trace evidence is routinely the result of particle size, chemical or modal analyses and as such constitutes compositional data. The issue is that compositional data, including percentages, parts per million etc., only carry relative information. This may be problematic where a comparison of percentages and other constrained/closed data is deemed a statistically valid and appropriate way to present trace evidence in a court of law. Notwithstanding an awareness of the existence of the constant sum problem since the seminal works of Pearson (1896) and Chayes (1960) and the introduction of the application of log-ratio techniques (Aitchison, 1986; Pawlowsky-Glahn and Egozcue, 2001; Pawlowsky-Glahn and Buccianti, 2011; Tolosana-Delgado and van den Boogaart, 2013), the problem that a constant sum destroys the potential independence of variances and covariances required for correlation and regression analysis and empirical multivariate methods (principal component analysis, cluster analysis, discriminant analysis, canonical correlation) is all too often not acknowledged in the statistical treatment of trace evidence. Yet the need for a robust treatment of forensic trace evidence analyses is obvious. This research examines the issues and potential pitfalls for forensic investigators if the constant sum constraint is ignored in the analysis and presentation of forensic trace evidence. Forensic case studies involving particle size and mineral analyses as trace evidence are used to demonstrate the use of a compositional data approach using a centred log-ratio (clr) transformation and multivariate statistical analyses.
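
The centred log-ratio transformation mentioned above can be sketched in a few lines: each part is log-transformed and the mean of the logs (the log of the geometric mean) is subtracted. The three-part composition and the function name `clr` are hypothetical, and zero parts, which need special handling in practice, are not treated here.

```python
import math


def clr(composition):
    """Centred log-ratio transform of a composition with strictly positive parts."""
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


# Hypothetical three-part composition, expressed in percent and as proportions.
percent = [20.0, 30.0, 50.0]
proportion = [0.2, 0.3, 0.5]

clr_percent = clr(percent)
clr_proportion = clr(proportion)

print(clr_percent)
print(clr_proportion)
print(sum(clr_percent))
```

The transform gives the same coordinates whether the data are closed to 100 or to 1, because only the ratios between parts enter the result. The clr values sum to zero (up to rounding), which is worth keeping in mind when they are passed to covariance-based methods.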

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to calibration of geochemical data and multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry, for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in ‘raw-format’, requiring data processing in order to remove instrument bias, as well as informed sequence interpretation. The applicability of these conventional calibration equations to core-scanning XRF data is further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as matrix effects. Log-ratio based calibration schemes have been developed and applied to clastic sedimentary sequences focusing mainly on energy dispersive-XRF (ED-XRF) core-scanning. This study has applied high resolution core-scanning XRF to Holocene sedimentary sequences from the tidal-dominated Indian Sundarbans (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a sub-set of core-scan and conventional ED-XRF data to quantify elemental composition. This provides a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.
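
A generic sketch of reduced major axis regression on log-ratio pairs is given below, to show the kind of fit the abstract refers to. The element pair, the intensity and concentration values, and the helper names `log_ratios` and `rma_fit` are hypothetical, and the code is not the published LRCE implementation. It only shows that the reduced major axis slope is the ratio of the standard deviations, signed by the covariance.

```python
import math


def log_ratios(numerator, denominator):
    """Element-wise natural log of numerator / denominator."""
    return [math.log(a / b) for a, b in zip(numerator, denominator)]


def rma_fit(x, y):
    """Reduced major axis regression: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical calibration sub-set: scanner counts and reference concentrations
# for two elements measured on the same four samples.
counts_a = [1200.0, 1500.0, 900.0, 2000.0]
counts_b = [800.0, 700.0, 950.0, 600.0]
conc_a = [3.1, 3.9, 2.2, 5.4]
conc_b = [2.4, 2.0, 2.9, 1.7]

x = log_ratios(counts_a, counts_b)  # log-ratio of scanner intensities
y = log_ratios(conc_a, conc_b)      # log-ratio of reference concentrations
slope, intercept = rma_fit(x, y)
print(slope, intercept)
```

Unlike ordinary least squares, this fit treats both variables as subject to error, which is one reason it is often preferred when scanner intensities and reference measurements are both noisy.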

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There is substantial international variation in human papillomavirus (HPV) prevalence; this study details the first report from Northern Ireland and additionally provides a systematic review and meta-analysis pooling the prevalence of high-risk (HR-HPV) subtypes among women with normal cytology in the UK and Ireland. Between February and December 2009, routine liquid based cytology (LBC) samples were collected for HPV detection (Roche Cobas® 4800 [PCR]) among unselected women attending for cervical cytology testing. Four electronic databases, including MEDLINE, were then searched from their inception until April 2011. A random effects meta-analysis was used to calculate a pooled HR-HPV prevalence and associated 95% confidence intervals (CI). 5,712 women, mean age 39 years (±SD 11.9 years; range 20-64 years), were included in the analysis, of which 5,068 (88.7%), 417 (7.3%) and 72 (1.3%) had normal, low, and high-grade cytological findings, respectively. Crude HR-HPV prevalence was 13.2% (95% CI, 12.7-13.7) among women with normal cytology and increased with cytological grade. In meta-analysis the pooled HR-HPV prevalence among those with normal cytology was 0.12 (95% CI, 0.10-0.14; 21 studies) with the highest prevalence in younger women. HPV 16 and HPV 18 specific estimates were 0.03 (95% CI, 0.02-0.05) and 0.01 (95% CI, 0.01-0.02), respectively. The findings of this Northern Ireland study and meta-analysis verify the prevalent nature of HPV infection among younger women. Reporting of the type-specific prevalence of HPV infection is relevant for evaluating the impact of future HPV immunization initiatives, particularly against HR-HPV types other than HPV 16 and 18. J. Med. Virol. 85:295-308, 2013. © 2012 Wiley Periodicals, Inc.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

OBJECTIVE: To investigate the impact of smoking and smoking cessation on cardiovascular mortality, acute coronary events, and stroke events in people aged 60 and older, and to calculate and report risk advancement periods for cardiovascular mortality in addition to traditional epidemiological relative risk measures.

DESIGN: Individual participant meta-analysis using data from 25 cohorts participating in the CHANCES consortium. Data were harmonised, analysed separately employing Cox proportional hazard regression models, and combined by meta-analysis.

RESULTS: Overall, 503,905 participants aged 60 and older were included in this study, of whom 37,952 died from cardiovascular disease. Random effects meta-analysis of the association of smoking status with cardiovascular mortality yielded a summary hazard ratio of 2.07 (95% CI 1.82 to 2.36) for current smokers and 1.37 (1.25 to 1.49) for former smokers compared with never smokers. Corresponding summary estimates for risk advancement periods were 5.50 years (4.25 to 6.75) for current smokers and 2.16 years (1.38 to 2.39) for former smokers. The excess risk in smokers increased with cigarette consumption in a dose-response manner, and decreased continuously with time since smoking cessation in former smokers. Relative risk estimates for acute coronary events and for stroke events were somewhat lower than for cardiovascular mortality, but patterns were similar.

CONCLUSIONS: Our study corroborates and expands evidence from previous studies in showing that smoking is a strong independent risk factor of cardiovascular events and mortality even at older age, advancing cardiovascular mortality by more than five years, and demonstrating that smoking cessation in these age groups is still beneficial in reducing the excess risk.
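
The combination step described in the DESIGN section can be sketched as a random effects pooling of per-cohort log hazard ratios. The abstract does not say which estimator was used, so the DerSimonian-Laird method below is an assumption, and the cohort-level hazard ratios and standard errors are invented purely for illustration.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Random effects pooling (DerSimonian-Laird).

    Returns (pooled estimate, its standard error, between-study variance tau^2).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return pooled, se_pooled, tau2


# Hypothetical per-cohort hazard ratios and standard errors of log(HR).
hazard_ratios = [1.8, 2.3, 2.0, 2.6]
log_hr_se = [0.10, 0.15, 0.08, 0.20]

log_hr = [math.log(hr) for hr in hazard_ratios]
pooled, se, tau2 = dersimonian_laird(log_hr, log_hr_se)
ci_low = math.exp(pooled - 1.96 * se)
ci_high = math.exp(pooled + 1.96 * se)
print("summary HR:", math.exp(pooled), "95% CI:", ci_low, ci_high, "tau^2:", tau2)
```

Pooling is done on the log scale and exponentiated at the end, which is why a summary hazard ratio and its confidence interval are not symmetric around the point estimate.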

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: Long working hours might increase the risk of cardiovascular disease, but prospective evidence is scarce, imprecise, and mostly limited to coronary heart disease. We aimed to assess long working hours as a risk factor for incident coronary heart disease and stroke.

Methods: We identified published studies through a systematic review of PubMed and Embase from inception to Aug 20, 2014. We obtained unpublished data for 20 cohort studies from the Individual-Participant-Data Meta-analysis in Working Populations (IPD-Work) Consortium and open-access data archives. We used cumulative random-effects meta-analysis to combine effect estimates from published and unpublished data.

Findings: We included 25 studies from 24 cohorts in Europe, the USA, and Australia. The meta-analysis of coronary heart disease comprised data for 603 838 men and women who were free from coronary heart disease at baseline; the meta-analysis of stroke comprised data for 528 908 men and women who were free from stroke at baseline. Follow-up for coronary heart disease was 5·1 million person-years (mean 8·5 years), in which 4768 events were recorded, and for stroke was 3·8 million person-years (mean 7·2 years), in which 1722 events were recorded. In cumulative meta-analysis adjusted for age, sex, and socioeconomic status, compared with standard hours (35-40 h per week), working long hours (≥55 h per week) was associated with an increase in risk of incident coronary heart disease (relative risk [RR] 1·13, 95% CI 1·02-1·26; p=0·02) and incident stroke (1·33, 1·11-1·61; p=0·002). The excess risk of stroke remained unchanged in analyses that addressed reverse causation, multivariable adjustments for other risk factors, and different methods of stroke ascertainment (range of RR estimates 1·30-1·42). We recorded a dose-response association for stroke, with RR estimates of 1·10 (95% CI 0·94-1·28; p=0·24) for 41-48 working hours, 1·27 (1·03-1·56; p=0·03) for 49-54 working hours, and 1·33 (1·11-1·61; p=0·002) for 55 working hours or more per week compared with standard working hours (p for trend<0·0001).

Interpretation: Employees who work long hours have a higher risk of stroke than those working standard hours; the association with coronary heart disease is weaker. These findings suggest that more attention should be paid to the management of vascular risk factors in individuals who work long hours.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The main curative therapy for patients with non-small cell lung cancer is surgery. Despite this, the survival rate is only 50%; it is therefore important to diagnose lung cancer and predict prognosis more efficiently. Raman spectroscopy is useful in the diagnosis of malignant and premalignant lesions. The aim of this study is to investigate the ability of Raman microscopy to diagnose lung cancer from surgically resected tissue sections, and predict the prognosis of these patients. Tumor tissue sections from curative resections are mapped by Raman microscopy and the spectra analyzed using multivariate techniques. Spectra from the tumor samples are also compared with their outcome data to define their prognostic significance. Using principal component analysis and random forest classification, Raman microscopy differentiates malignant from normal lung tissue. Principal component analysis of 34 tumor spectra predicts early postoperative cancer recurrence with a sensitivity of 73% and specificity of 74%. Spectral analysis reveals elevated porphyrin levels in the normal samples and more DNA in the tumor samples. Raman microscopy can be a useful technique for the diagnosis and prognosis of lung cancer patients receiving surgery, and for elucidating the biochemical properties of lung tumors. © 2010 Society of Photo-Optical Instrumentation Engineers. [DOI: 10.1117/1.3323088]