53 results for FUNCTIONAL DATA ANALYSIS


Relevance:

100.00%

Abstract:

Retrospective clinical datasets are often characterized by a relatively small sample size and many missing data. In this case, a common way of handling the missingness consists in discarding from the analysis patients with missing covariates, further reducing the sample size. Alternatively, if the mechanism that generated the missing data allows, incomplete data can be imputed on the basis of the observed data, avoiding the reduction of the sample size and allowing methods that require complete data to be applied later on. Moreover, methodologies for data imputation might depend on the particular purpose and might achieve better results by considering specific characteristics of the domain. The problem of missing data treatment is studied in the context of survival tree analysis for the estimation of a prognostic patient stratification. Survival tree methods usually address this problem by using surrogate splits, that is, splitting rules that use other variables yielding similar results to the original ones. Instead, our methodology consists in modeling the dependencies among the clinical variables with a Bayesian network, which is then used to perform data imputation, thus allowing the survival tree to be applied on the completed dataset. The Bayesian network is directly learned from the incomplete data using a structural expectation–maximization (EM) procedure in which the maximization step is performed with an exact anytime method, so that the only source of approximation is due to the EM formulation itself. On both simulated and real data, our proposed methodology usually outperformed several existing methods for data imputation and the imputation so obtained improved the stratification estimated by the survival tree (especially with respect to using surrogate splits).

Relevance:

100.00%

Abstract:

Urothelial cancer (UC) is highly recurrent and can progress from non-invasive (NMIUC) to a more aggressive muscle-invasive (MIUC) subtype that invades the muscle tissue layer of the bladder. We present a proof-of-principle study showing that network-based features of gene pairs can be used to improve classifier performance and the functional analysis of urothelial cancer gene expression data. In the first step of our procedure each individual sample of a UC gene expression dataset is inflated by gene pair expression ratios that are defined based on a given network structure. In the second step an elastic net feature selection procedure for network-based signatures is applied to discriminate between NMIUC and MIUC samples. We performed a repeated random subsampling cross validation in three independent datasets. The network signatures were characterized by a functional enrichment analysis and studied for the enrichment of known cancer genes. We observed that the network-based gene signatures from meta collections of protein-protein interaction (PPI) databases such as CPDB and the PPI databases HPRD and BioGrid improved the classification performance compared to single-gene-based signatures. The network-based signatures that were derived from PPI databases showed a prominent enrichment of cancer genes (e.g., TP53, TRIM27 and HNRNPA2B1). We provide a novel integrative approach for large-scale gene expression analysis for the identification and development of novel diagnostic targets in bladder cancer. Further, our method allowed us to link cancer gene associations to network-based expression signatures that are not observed in gene-based expression signatures.
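The first step described above, extending each sample with gene-pair expression ratios along the edges of a network, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `add_pair_ratio_features`, the choice of a log2 ratio, and the requirement of strictly positive expression values are all assumptions made for the example.

```python
import math


def add_pair_ratio_features(expression, edges):
    """Return a copy of one sample's expression values extended with
    one log2-ratio feature per network edge.

    expression: dict mapping gene symbol -> positive expression value
    edges: iterable of (gene_a, gene_b) pairs taken from a network
           (for example a PPI database); hypothetical input format.
    """
    features = dict(expression)  # keep the single-gene features
    for gene_a, gene_b in edges:
        # Skip edges whose genes were not measured in this sample.
        if gene_a not in expression or gene_b not in expression:
            continue
        ratio = expression[gene_a] / expression[gene_b]
        # Assumption: the ratio is stored on a log2 scale.
        features[f"{gene_a}/{gene_b}"] = math.log2(ratio)
    return features
```

The enlarged feature dictionary could then be passed to any sparse classifier, such as an elastic-net penalised model, for the second step.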

Relevance:

100.00%

Abstract:

A compositional multivariate approach is used to analyse regional scale soil geochemical data obtained as part of the Tellus Project generated by the Geological Survey Northern Ireland (GSNI). The multi-element total concentration data presented comprise XRF analyses of 6862 rural soil samples collected at 20 cm depths on a non-aligned grid at one site per 2 km². Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic, up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques including compositional Principal Component Analysis (PCA) and minimum/maximum autocorrelation factor (MAF) analysis were used to determine the influence of underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was determined by the first four principal components (PCs), implying “significant” structure in the data. Analysis of variance showed that only 10 PCs were necessary to classify the soil geochemical data. To consider an improvement over PCA that uses the spatial relationships of the data, a classification based on MAF analysis was undertaken using the first 6 dominant factors.
Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons, and linear discriminant analysis (LDA) was undertaken. Prediction accuracy for LDA classification improved from 60.98% based on PCA using 10 principal components to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat-covered areas since the creation of the superficial deposit classification. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this in classification analysis.
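The centered log-ratio (clr) transform used in this abstract has a standard definition: each part of a composition is replaced by its logarithm minus the mean of the logarithms of all parts. The sketch below illustrates that definition only; the function name `clr` and the toy values are assumptions, and it does not reproduce the Tellus processing chain (for example, the imputation of censored values).

```python
import math


def clr(composition):
    """Centered log-ratio transform of one composition.

    composition: sequence of strictly positive parts (e.g. element
    concentrations). Returns a list of the same length whose entries
    sum to zero.
    """
    if any(value <= 0 for value in composition):
        # Zeros or censored values must be imputed before the transform.
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(value) for value in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [log_value - mean_log for log_value in logs]
```

Because only ratios between parts matter, rescaling a sample (for example from percent to mg/kg) leaves its clr coordinates unchanged, which is why the transform sidesteps the closure problem.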

Relevance:

90.00%

Abstract:

Purpose – The purpose of this paper is to present an analysis of media representation of business ethics within 62 international newspapers to explore the longitudinal and contextual evolution of business ethics and associated terminology. Levels of coverage and contextual analysis of the content of the articles are used as surrogate measures of the penetration of business ethics concepts into society. Design/methodology/approach – This paper uses a text mining application based on two samples of data: analysis of 62 national newspapers in 21 countries from 1990 to 2008; analysis of the content of two samples of articles containing the term business ethics (comprised of 100 newspaper articles spread over an 18-year period from a sample of US and UK newspapers). Findings – The paper demonstrates increased coverage of sustainability topics within the media over the last 18 years associated with events such as the Rio Summit. Whilst some peaks are associated with business ethics scandals, the overall coverage remains steady. There is little apparent use in the media of concepts such as corporate citizenship. The academic community and company ethical codes appear to adopt a wider definition of business ethics more akin to that associated with sustainability, in comparison with the focus taken by the media, especially in the USA. Coverage demonstrates clear regional bias and contextual analysis of the articles in the UK and USA also shows interesting parallels and divergences in the media representation of business ethics. Originality/value – A promising avenue to explore how the evolution of sustainability issues including business ethics can be tracked within a societal context.

Relevance:

90.00%

Abstract:

The analysis of gene function through RNA interference (RNAi)-based reverse genetics in plant parasitic nematodes (PPNs) remains inexplicably reliant on the use of long double-stranded RNA (dsRNA) silencing triggers; a practice inherently disadvantageous due to the introduction of superfluous dsRNA sequence, increasing chances of aberrant or off-target gene silencing through interactions between nascent short interfering RNAs (siRNAs) and non-cognate mRNA targets. Recently, we have shown that non-nematode, long dsRNAs have a propensity to elicit profound impacts on the phenotype and migrational abilities of both root knot and cyst nematodes. This study presents, to our knowledge for the first time, gene-specific knockdown of FMRFamide-like peptide (flp) transcripts, using discrete 21 bp siRNAs in potato cyst nematode Globodera pallida, and root knot nematode Meloidogyne incognita infective (J2) stage juveniles. Both knockdown at the transcript level through quantitative (q)PCR analysis and functional data derived from migration assay indicate that siRNAs targeting certain areas of the FMRFamide-like peptide (FLP) transcripts are potent and specific in the silencing of gene function. In addition, we present a method of manipulating siRNA activity through the management of strand thermodynamics. Initial evaluation of strand thermodynamics as a determinant of RNA-induced Silencing Complex (RISC) strand selection (inferred from knockdown efficacy) in the siRNAs presented here suggested that the purported influence of 5' strand stability on guide incorporation may be somewhat promiscuous. However, we have found that on strategically incorporating base mismatches in the sense strand of a G. pallida-specific siRNA we could specifically increase or decrease the knockdown of its target (specific to the antisense strand), presumably through creating more favourable thermodynamic profiles for incorporation of either the sense (non-target-specific) or antisense (target-specific) strand into a cleavage-competent RISC. Whilst the efficacy of similar approaches to siRNA modification has been demonstrated in the context of Drosophila whole-cell lysate preparations and in mammalian cell cultures, it remained to be seen how these sense strand mismatches may impact on gene silencing in vivo, in relation to different targets and in different sequence contexts. This work presents the first application of such an approach in a whole organism; initial results show promise. (C) 2009 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

Relevance:

90.00%

Abstract:

Estimation and detection of the hemodynamic response (HDR) are of great importance in functional MRI (fMRI) data analysis. In this paper, we propose the use of three H∞ adaptive filters (finite memory, exponentially weighted, and time-varying) for accurate estimation and detection of the HDR. The H∞ approach is used because it safeguards against the worst case disturbances and makes no assumptions on the (statistical) nature of the signals [B. Hassibi and T. Kailath, in Proc. ICASSP, 1995, vol. 2, pp. 949-952; T. Ratnarajah and S. Puthusserypady, in Proc. 8th IEEE Workshop DSP, 1998, pp. 1483-1487]. Performances of the proposed techniques are compared to the conventional t-test method as well as the well-known least mean squares (LMS) and recursive least squares algorithms. Extensive numerical simulations show that the proposed methods result in better HDR estimations and activation detections.

Relevance:

90.00%

Abstract:

BACKGROUND:

Increased superoxide anion production increases oxidative stress and reduces nitric oxide bioactivity in vascular disease states. NAD(P)H oxidase is an important source of superoxide in human blood vessels, and some studies suggest a possible association between polymorphisms in the NAD(P)H oxidase CYBA gene and atherosclerosis; however, no functional data address this hypothesis. We examined the relationships between the CYBA C242T polymorphism and direct measurements of superoxide production in human blood vessels.

METHODS AND RESULTS:

Vascular NAD(P)H oxidase activity was determined in human saphenous veins obtained from 110 patients with coronary artery disease and identified risk factors. Immunoblotting, reverse-transcription polymerase chain reaction, and DNA sequencing showed that p22phox protein, mRNA, and 242C/T allelic variants are expressed in human blood vessels. Vascular superoxide production, both basal and NADH-stimulated, was highly variable between patients, but the presence of the CYBA 242T allele was associated with significantly reduced vascular NAD(P)H oxidase activity, independent of other clinical risk factors for atherosclerosis.

CONCLUSIONS:

Association of the CYBA 242T allele with reduced NAD(P)H oxidase activity in human blood vessels suggests that genetic variation in NAD(P)H oxidase components may play a significant role in modulating superoxide production in human atherosclerosis.

Relevance:

90.00%

Abstract:

Diabetic kidney disease, or diabetic nephropathy (DN), is a major complication of diabetes and the leading cause of end-stage renal disease (ESRD) that requires dialysis treatment or kidney transplantation. In addition to the decrease in the quality of life, DN accounts for a large proportion of the excess mortality associated with type 1 diabetes (T1D). Whereas the degree of glycemia plays a pivotal role in DN, a subset of individuals with poorly controlled T1D do not develop DN. Furthermore, strong familial aggregation supports genetic susceptibility to DN. However, the genes and the molecular mechanisms behind the disease remain poorly understood, and current therapeutic strategies rarely result in reversal of DN. In the GEnetics of Nephropathy: an International Effort (GENIE) consortium, we have undertaken a meta-analysis of genome-wide association studies (GWAS) of T1D DN comprising ~2.4 million single nucleotide polymorphisms (SNPs) imputed in 6,691 individuals. After additional genotyping of 41 top ranked SNPs representing 24 independent signals in 5,873 individuals, combined meta-analysis revealed association of two SNPs with ESRD: rs7583877 in the AFF3 gene (P = 1.2×10⁻⁸) and an intergenic SNP on chromosome 15q26 between the genes RGMA and MCTP2, rs12437854 (P = 2.0×10⁻⁹). Functional data suggest that AFF3 influences renal tubule fibrosis via the transforming growth factor-beta (TGF-β1) pathway. The strongest association with DN as a primary phenotype was seen for an intronic SNP in the ERBB4 gene (rs7588550, P = 2.1×10⁻⁷), a gene with type 2 diabetes DN differential expression and in the same intron as a variant with cis-eQTL expression of ERBB4. All these detected associations represent new signals in the pathogenesis of DN.

Relevance:

90.00%

Abstract:

The purpose of this study was to explore the care processes experienced by community-dwelling adults dying from advanced heart failure, their family caregivers, and their health-care providers. A descriptive qualitative design was used to guide data collection, analysis, and interpretation. The sample comprised 8 patients, 10 informal caregivers, 11 nurses, 3 physicians, and 3 pharmacists. Data analysis revealed that palliative care was influenced by unique contextual factors (i.e., cancer model of palliative care, limited access to resources, prognostication challenges). Patients described choosing interventions and living with fatigue, pain, shortness of breath, and functional decline. Family caregivers described surviving caregiver burden and drawing on their faith. Health professionals described their role as trying to coordinate care, building expertise, managing medications, and optimizing interprofessional collaboration. Participants strove towards 3 outcomes: effective symptom management, satisfaction with care, and a peaceful death. © McGill University School of Nursing.

Relevance:

90.00%

Abstract:

Genetic risk factors for chronic kidney disease (CKD) are being identified through international collaborations. By comparison, epigenetic risk factors for CKD have only recently been considered using population-based approaches. DNA methylation is a major epigenetic modification that is associated with complex diseases, so we investigated methylome-wide loci for association with CKD. A total of 485,577 unique features were evaluated in 255 individuals with CKD (cases) and 152 individuals without evidence of renal disease (controls). Following stringent quality control, raw data were quantile normalized and β values calculated to reflect the methylation status at each site. The difference in methylation status was evaluated between cases and controls with resultant P values adjusted for multiple testing. Genes with significantly increased and decreased levels of DNA methylation were considered for biological relevance by functional enrichment analysis using KEGG pathways in Partek Genomics Suite. Twenty-three genes, where more than one CpG per locus was identified with adjusted P < 10⁻⁸, demonstrated significant methylation changes associated with CKD and additional support for these associated loci was sought from published literature. Strong biological candidates for CKD that showed statistically significant differential methylation include CUX1, ELMO1, FKBP5, INHBA-AS1, PTPRN2, and PRKAG2 genes; several genes are differentially methylated in kidney tissue and RNA-seq supports a functional role for differential methylation in ELMO1 and PRKAG2 genes. This study reports the largest, most comprehensive, genome-wide quantitative evaluation of DNA methylation for association with CKD. Evidence confirming that methylation sites influence development of CKD would stimulate research to identify epigenetic therapies that might be clinically useful for CKD.
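The β value mentioned in this abstract is conventionally the methylated signal divided by the total signal at a CpG site, so that it runs from 0 (unmethylated) towards 1 (fully methylated). The sketch below shows that convention only. The abstract does not state the exact formula used, so the function name `beta_value` and the stabilising `offset` of 100, a common default for array data, are assumptions.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Methylation level at one CpG site as a fraction of total signal.

    methylated, unmethylated: non-negative probe intensities.
    offset: assumed stabilising constant that keeps the ratio well
            behaved when both intensities are close to zero.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)
```

With these assumptions, a site with a strong methylated signal and no unmethylated signal gives a β value close to, but below, 1, and a site with no signal at all gives 0 rather than a division error.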

Relevance:

90.00%

Abstract:

Context. Comet 67P/Churyumov-Gerasimenko is the target of the European Space Agency Rosetta spacecraft rendezvous mission. Detailed physical characterisation of the comet before arrival is important for mission planning as well as providing a test bed for ground-based observing and data-analysis methods. Aims: To conduct a long-term observational programme to characterize the physical properties of the nucleus of the comet, via ground-based optical photometry, and to combine our new data with all available nucleus data from the literature. Methods: We applied aperture photometry techniques on our imaging data and combined the extracted rotational lightcurves with data from the literature. Optical lightcurve inversion techniques were applied to constrain the spin state of the nucleus and its broad shape. We performed a detailed surface thermal analysis with the shape model and optical photometry by incorporating both into the new Advanced Thermophysical Model (ATPM), along with all available Spitzer 8-24 μm thermal-IR flux measurements from the literature. Results: A convex triangular-facet shape model was determined with axial ratios b/a = 1.239 and c/a = 0.819. These values can vary by as much as 7% in each axis and still result in a statistically significant fit to the observational data. Our best spin state solution has Psid = 12.76137 ± 0.00006 h, and a rotational pole orientated at Ecliptic coordinates λ = 78°(±10°), β = + 58°(±10°). The nucleus phase darkening behaviour was measured and best characterized using the IAU HG system. Best fit parameters are: G = 0.11 ± 0.12 and HR(1,1,0) = 15.31 ± 0.07. Our shape model combined with the ATPM can satisfactorily reconcile all optical and thermal-IR data, with the fit to the Spitzer 24 μm data taken in February 2004 being exceptionally good.
We derive a range of mutually-consistent physical parameters for each thermal-IR data set, including effective radius, geometric albedo, surface thermal inertia and roughness fraction. Conclusions: The overall nucleus dimensions are well constrained and strongly imply a broad nucleus shape more akin to comet 9P/Tempel 1, rather than the highly elongated or "bi-lobed" nuclei seen for comets 103P/Hartley 2 or 8P/Tuttle. The derived low thermal inertia of

Relevance:

90.00%

Abstract:

Recent technological advances have increased the quantity of movement data being recorded. While valuable knowledge can be gained by analysing such data, its sheer volume creates challenges. Geovisual analytics, which helps the human cognition process by using tools to reason about data, offers powerful techniques to resolve these challenges. This paper introduces such a geovisual analytics environment for exploring movement trajectories, which provides visualisation interfaces, based on the classic space-time cube. Additionally, a new approach, using the mathematical description of motion within a space-time cube, is used to determine the similarity of trajectories and forms the basis for clustering them. These techniques were used to analyse pedestrian movement. The results reveal interesting and useful spatiotemporal patterns and clusters of pedestrians exhibiting similar behaviour.

Relevance:

90.00%

Abstract:

Research over the past two decades on the Holocene sediments from the tide-dominated west side of the lower Ganges delta has focussed on constraining the sedimentary environment through grain size distributions (GSD). GSD has traditionally been assessed through the use of probability density function (PDF) methods (e.g. log-normal, log skew-Laplace functions), but these approaches do not acknowledge the compositional nature of the data, which may compromise outcomes in lithofacies interpretations. The use of PDF approaches in GSD analysis poses a series of challenges for the development of lithofacies models, such as equifinal distribution coefficients and the obscuring of empirical data variability. In this study a methodological framework for characterising GSD is presented through compositional data analysis (CODA) plus a multivariate statistical framework. This provides a statistically robust analysis of the fine tidal estuary sediments from the West Bengal Sundarbans, relative to alternative PDF approaches.

Relevance:

90.00%

Abstract:

Context. The Public European Southern Observatory Spectroscopic Survey of Transient Objects (PESSTO) began as a public spectroscopic survey in April 2012. PESSTO classifies transients from publicly available sources and wide-field surveys, and selects science targets for detailed spectroscopic and photometric follow-up. PESSTO runs for nine months of the year, January - April and August - December inclusive, and typically has allocations of 10 nights per month. 

Aims. We describe the data reduction strategy and data products that are publicly available through the ESO archive as the Spectroscopic Survey data release 1 (SSDR1). 

Methods. PESSTO uses the New Technology Telescope with the instruments EFOSC2 and SOFI to provide optical and NIR spectroscopy and imaging. We target supernovae and optical transients brighter than 20.5 mag for classification. Science targets are selected for follow-up based on the PESSTO science goal of extending knowledge of the extremes of the supernova population. We use standard EFOSC2 set-ups providing spectra with resolutions of 13-18 Å between 3345-9995 Å. A subset of the brighter science targets are selected for SOFI spectroscopy with the blue and red grisms (0.935-2.53 μm and resolutions 23-33 Å) and imaging with broadband JHKs filters.

Results. This first data release (SSDR1) contains flux calibrated spectra from the first year (April 2012-2013). A total of 221 confirmed supernovae were classified, and we released calibrated optical spectra and classifications publicly within 24 h of the data being taken (via WISeREP). The data in SSDR1 replace those released spectra. They have more reliable and quantifiable flux calibrations, correction for telluric absorption, and are made available in standard ESO Phase 3 formats. We estimate the absolute accuracy of the flux calibrations for EFOSC2 across the whole survey in SSDR1 to be typically ∼15%, although a number of spectra will have less reliable absolute flux calibration because of weather and slit losses. Acquisition images for each spectrum are available which, in principle, can allow the user to refine the absolute flux calibration. The standard NIR reduction process does not produce high accuracy absolute spectrophotometry but synthetic photometry with accompanying JHKs imaging can improve this. Whenever possible, reduced SOFI images are provided to allow this.

Conclusions. Future data releases will focus on improving the automated flux calibration of the data products. The rapid turnaround between discovery and classification and access to reliable pipeline processed data products has allowed early science papers in the first few months of the survey.