820 results for Lanczos, Linear systems, Generalized cross validation
Abstract:
In small islands, a freshwater lens can develop due to the recharge induced by rain. Magnitude and spatial distribution of this recharge control the elevation of freshwater and the depth of its interface with salt water. Therefore, the study of lens morphology gives useful information on both the recharge and water uptake due to evapotranspiration by vegetation. Electrical resistivity tomography was applied on a small coral reef island, giving relevant information on the lens structure. Variable density groundwater flow models were then applied to simulate freshwater behavior. Cross validation of the geoelectrical model and the groundwater model showed that recharge exceeds water uptake in dunes with little vegetation, allowing the lens to develop. Conversely, in the low-lying and densely vegetated sectors, where water uptake exceeds recharge, the lens cannot develop and seawater intrusion occurs. This combined modeling method constitutes an original approach to evaluate effective groundwater recharge in such environments.
[Comte, J.-C., O. Banton, J.-L. Join, and G. Cabioch (2010), Evaluation of effective groundwater recharge of freshwater lens in small islands by the combined modeling of geoelectrical data and water heads, Water Resour. Res., 46, W06601, doi:10.1029/2009WR008058.]
Abstract:
Schizophrenia is a common psychotic mental disorder that is believed to result from the effects of multiple genetic and environmental factors. In this study, we explored gene-gene interactions and main effects in both case-control (657 cases and 411 controls) and family-based (273 families, 1350 subjects) datasets of English or Irish ancestry. Fifty-three markers in 8 genes were genotyped in the family sample and 44 markers in 7 genes were genotyped in the case-control sample. The Multifactor Dimensionality Reduction Pedigree Disequilibrium Test (MDR-PDT) was used to examine epistasis in the family dataset and a 3-locus model was identified (permuted p=0.003). The 3-locus model involved the IL3 (rs2069803), RGS4 (rs2661319), and DTNBP1 (rs21319539) genes. We used MDR to analyze the case-control dataset containing the same markers typed in the RGS4, IL3 and DTNBP1 genes and found evidence of a joint effect between IL3 (rs31400) and DTNBP1 (rs760761) (cross-validation consistency 4/5, balanced prediction accuracy=56.84%, p=0.019). While this is not a direct replication, the results obtained from both the family and case-control samples collectively suggest that IL3 and DTNBP1 are likely to interact and jointly contribute to increased risk for schizophrenia. We also observed a significant main effect in DTNBP1, which survived correction for multiple comparisons, and numerous nominally significant effects in several genes. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
Nitrogen Dioxide (NO2) is known to act as an environmental trigger for many respiratory illnesses. As a pollutant it is difficult to map accurately, as concentrations can vary greatly over small distances. In this study three geostatistical techniques were compared, producing maps of NO2 concentrations in the United Kingdom (UK). The primary data source for each technique was NO2 point data, generated from background automatic monitoring and background diffusion tubes, which are analysed by different laboratories on behalf of local councils and authorities in the UK. The techniques used were simple kriging (SK), ordinary kriging (OK) and simple kriging with a locally varying mean (SKlm). SK and OK make use of the primary variable only. SKlm differs in that it utilises additional data to inform prediction, and hence potentially reduces uncertainty. The secondary data source was Oxides of Nitrogen (NOx) derived from dispersion modelling outputs, at 1 km × 1 km resolution for the UK. These data were used to define the locally varying mean in SKlm, using two regression approaches: (i) global regression (GR) and (ii) geographically weighted regression (GWR). Based upon summary statistics and cross-validation prediction errors, SKlm using GWR-derived local means produced the most accurate predictions. Therefore, using GWR to inform SKlm was beneficial in this study.
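The kriging machinery behind these comparisons can be illustrated with a minimal sketch. It assumes an exponential covariance model with unit range (an arbitrary illustrative choice, not the variogram fitted in the study) and solves the ordinary-kriging system for the prediction weights:

```python
import numpy as np

def ordinary_kriging(coords, values, target, a=1.0):
    """Predict at `target` by ordinary kriging with an exponential
    covariance C(h) = exp(-h/a). Illustrative sketch only."""
    coords = np.asarray(coords, dtype=float)
    n = len(values)
    # pairwise covariances between data points
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    C = np.exp(-d / a)
    # covariances between data points and the target location
    c = np.exp(-np.linalg.norm(coords - np.asarray(target, float), axis=1) / a)
    # augmented system enforcing weights that sum to one (Lagrange multiplier)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = C
    A[n, :n] = A[:n, n] = 1.0
    b = np.append(c, 1.0)
    w = np.linalg.solve(A, b)[:n]
    return float(w @ np.asarray(values, float))
```

Because no nugget term is included, the predictor honours the data exactly at monitoring locations; SKlm would apply the same machinery to the residuals from the GWR-derived local mean.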
Abstract:
We determine generalized cross sections for two-photon double ionization of He in the photon energy region between 40.7 and 47 eV where absorption of two photons can lead to non-sequential double ionization only. The present cross sections, obtained in R-matrix Floquet theory, agree with cross sections obtained from time-dependent calculations. By examining the ratio of two-photon double ionization to two-photon single ionization, we demonstrate that core excitation effects at an intensity of 10(13) W cm(-2) are relatively unimportant at 45 eV, but that they are significant at other photon energies.
Abstract:
In 2004 nineteen scientists from fourteen institutions in seven countries collaborated in the landmark study described in chapter 2 (Thomas et al., 2004a). This chapter provides an overview of results of studies published subsequently and assesses how much, and why, new results differ from those of Thomas et al.
Some species distribution modeling (SDM) studies are directly comparable to the Thomas et al. estimates. Others using somewhat different methods nonetheless illuminate whether the original estimates were of the right order of magnitude. Climate similarity models (Williams et al., 2007; Williams and Jackson, 2007) and biome and vegetation dynamic models (Perry and Enright, 2006) have also been applied in the context of climate change, providing interesting opportunities for comparison and cross-validation with results from SDMs.
This chapter concludes with an assessment of whether the range of extinction risk estimates presented in 2004 can be narrowed, and whether the mean estimate should be revised upward or downward. To set the stage for these analyses, the chapter begins with brief reviews of advances in climate modeling and species modeling since 2004.
Abstract:
Health care research includes many studies that combine quantitative and qualitative methods. In this paper, we revisit the quantitative-qualitative debate and review the arguments for and against using mixed methods. In addition, we discuss the implications stemming from our view that the paradigms upon which the methods are based have a different view of reality and therefore a different view of the phenomenon under study. Because the two paradigms do not study the same phenomena, quantitative and qualitative methods cannot be combined for cross-validation or triangulation purposes. However, they can be combined for complementary purposes. Future standards for mixed-methods research should clearly reflect this recommendation.
Abstract:
Background: More accurate coronary heart disease (CHD) prediction, specifically in middle-aged men, is needed to reduce the burden of disease more effectively. We hypothesised that a multilocus genetic risk score could refine CHD prediction beyond classic risk scores and obtain more precise risk estimates using a prospective cohort design.
Methods: Using data from nine prospective European cohorts, including 26,221 men, we selected in a case-cohort setting 4,818 healthy men at baseline, and used Cox proportional hazards models to examine associations between CHD and risk scores based on genetic variants representing 13 genomic regions. Over follow-up (range: 5-18 years), 1,736 incident CHD events occurred. Genetic risk scores were validated in men with at least 10 years of follow-up (632 cases, 1,361 non-cases). Genetic risk score 1 (GRS1) combined 11 SNPs and two haplotypes, with effect estimates from previous genome-wide association studies. GRS2 combined 11 SNPs plus 4 SNPs from the haplotypes, with coefficients estimated from these prospective cohorts using 10-fold cross-validation. Scores were added to a model adjusted for classic risk factors comprising the Framingham risk score, and 10-year risks were derived.
Results: Both scores improved net reclassification (NRI) over the Framingham score (7.5%, p = 0.017 for GRS1; 6.5%, p = 0.044 for GRS2), but GRS2 also improved discrimination (c-index improvement 1.11%, p = 0.048). Subgroup analysis of men aged 50-59 (436 cases, 603 non-cases) showed improved net reclassification for GRS1 (13.8%) and GRS2 (12.5%). NRI remained significant for both scores when family history of CHD was added to the baseline model for this male subgroup, improving prediction of early-onset CHD events.
Conclusions: Genetic risk scores add precision to risk estimates for CHD and improve prediction beyond classic risk factors, particularly for middle-aged men.
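The 10-fold cross-validation used when estimating the GRS2 coefficients can be sketched generically. This is a minimal illustration with placeholder `fit`/`predict` callables, not the study's Cox modelling pipeline:

```python
import random

def k_fold_splits(n, k=10, seed=0):
    """Partition sample indices 0..n-1 into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validated_predictions(X, y, fit, predict, k=10):
    """Out-of-fold predictions: each sample is scored by a model
    trained without it, as when score coefficients are estimated
    within cross-validation."""
    preds = [None] * len(y)
    for fold in k_fold_splits(len(y), k):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(model, X[i])
    return preds
```

Estimating coefficients only on training folds keeps each subject's predicted risk independent of his own outcome, which is what makes the reported reclassification and discrimination statistics honest.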
Abstract:
Model selection between competing models is a key consideration in the discovery of prognostic multigene signatures. The use of appropriate statistical performance measures, as well as verification of the biological significance of the signatures, is imperative to maximise the chance of external validation of the generated signatures. Current approaches in time-to-event studies often use only a single measure of performance in model selection, such as logrank test p-values, or dichotomise the follow-up times at some phase of the study to facilitate signature discovery. In this study we improve the prognostic signature discovery process through the application of the multivariate partial Cox model combined with the concordance index, hazard ratio of predictions, independence from available clinical covariates and biological enrichment as measures of signature performance. The proposed framework was applied to discover prognostic multigene signatures from early breast cancer data. The partial Cox model, combined with the multiple performance measures, was used both to guide the selection of the optimal panel of prognostic genes and to predict risk within cross-validation, without dichotomising the follow-up times at any stage. The signatures were successfully externally validated in independent breast cancer datasets, yielding a hazard ratio of 2.55 [1.44, 4.51] for the top-ranking signature.
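Of the performance measures above, the concordance index has a particularly simple definition: the fraction of usable subject pairs in which the higher predicted risk belongs to the subject who failed earlier. A minimal sketch (skipping tied event times, which a full implementation would handle):

```python
def concordance_index(times, events, scores):
    """Harrell's c-index for right-censored data. A pair is usable
    when the subject with the shorter follow-up time had an observed
    event; ties in risk score count as half-concordant."""
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                continue  # tied times skipped in this minimal sketch
            early, late = (i, j) if times[i] < times[j] else (j, i)
            if not events[early]:
                continue  # censored before the other's time: not usable
            usable += 1
            if scores[early] > scores[late]:
                concordant += 1.0
            elif scores[early] == scores[late]:
                concordant += 0.5
    return concordant / usable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why it complements the hazard ratio as a selection criterion: it measures discrimination rather than effect size.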
Abstract:
In this paper the evolution of a time domain dynamic identification technique based on a statistical moment approach is presented. This technique can be used for structures under base random excitation in both the linear and the nonlinear regime. By applying Itô stochastic calculus, special algebraic equations can be obtained that depend on the statistical moments of the response of the system to be identified. Such equations can be used for the dynamic identification of the mechanical parameters and of the input. Unlike many techniques in the literature, these equations make it possible to identify the dissipation characteristics independently of the input. The paper first presents the original formulation of the technique, applicable to nonlinear systems and based on a restricted class of potential models. It then describes a second formulation, applicable to any kind of linear system and based on a class of linear models characterized by a mass-proportional damping matrix.
Abstract:
The use of handheld near infrared (NIR) instrumentation, as a tool for rapid analysis, has the potential to be used widely in the animal feed sector. A comparison was made between handheld NIR and benchtop instruments in terms of proximate analysis of poultry feed, using off-the-shelf calibration models and including statistical analysis. Additionally, melamine-adulterated soya bean products were used to develop qualitative and quantitative calibration models from the NIRS spectral data, with excellent calibration models and prediction statistics obtained. For the quantitative approach, the coefficients of determination (R2) were found to be 0.94-0.99, with corresponding root mean square errors of calibration and prediction of 0.081-0.215% and 0.095-0.288%, respectively. In addition, cross-validation was used to further validate the models, with the root mean square error of cross-validation found to be 0.101-0.212%. Furthermore, by adopting a qualitative approach with the spectral data and applying Principal Component Analysis, it was possible to discriminate between adulterated and pure samples.
Abstract:
The melting of high-latitude permafrost peatlands is a major concern due to a potential positive feedback on global climate change. We examine the ecology of testate amoebae in permafrost peatlands, based on sites in Sweden (~ 200 km north of the Arctic Circle). Multivariate statistical analysis confirms that water-table depth and moisture content are the dominant controls on the distribution of testate amoebae, corroborating the results from studies in mid-latitude peatlands. We present a new testate amoeba-based water table transfer function and thoroughly test it for the effects of spatial autocorrelation, clustered sampling design and uneven sampling gradients. We find that the transfer function has good predictive power; the best-performing model is based on tolerance-downweighted weighted averaging with inverse deshrinking (performance statistics with leave-one-out cross validation: R2 = 0.87, RMSEP = 5.25 cm). The new transfer function was applied to a short core from Stordalen mire, and reveals a major shift in peatland ecohydrology coincident with the onset of the Little Ice Age (c. AD 1400). We also applied the model to an independent contemporary dataset from Stordalen and find that it outperforms predictions based on other published transfer functions. The new transfer function will enable palaeohydrological reconstruction from permafrost peatlands in Northern Europe, thereby permitting greatly improved understanding of the long-term ecohydrological dynamics of these important carbon stores as well as their responses to recent climate change.
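The weighted-averaging core of such a transfer function is compact enough to sketch. The version below is plain WA with leave-one-out cross-validation; the study's best-performing model additionally uses tolerance downweighting and inverse deshrinking, which are omitted here:

```python
def wa_optima(abund, env):
    """Weighted-averaging optimum of each taxon: the abundance-weighted
    mean of the environmental variable over the training samples."""
    m = len(abund[0])
    return [sum(a[k] * e for a, e in zip(abund, env)) /
            sum(a[k] for a in abund) for k in range(m)]

def wa_predict(sample, optima):
    """Prediction for one sample: abundance-weighted mean of taxon optima."""
    return sum(a * u for a, u in zip(sample, optima)) / sum(sample)

def loo_rmsep(abund, env):
    """Leave-one-out RMSEP: each sample is predicted from optima
    estimated on all the other samples."""
    errs = []
    for i in range(len(env)):
        tr_a = abund[:i] + abund[i + 1:]
        tr_e = env[:i] + env[i + 1:]
        errs.append(wa_predict(abund[i], wa_optima(tr_a, tr_e)) - env[i])
    return (sum(e * e for e in errs) / len(errs)) ** 0.5
```

Here `abund` is a samples-by-taxa table of testate amoeba abundances and `env` the observed water-table depths; the RMSEP of 5.25 cm quoted above is exactly this kind of leave-one-out statistic, computed for the full tolerance-downweighted model.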
Abstract:
We present a novel method for the light-curve characterization of Pan-STARRS1 Medium Deep Survey (PS1 MDS) extragalactic sources into stochastic variables (SVs) and burst-like (BL) transients, using multi-band image-differencing time-series data. We select detections in difference images associated with galaxy hosts using a star/galaxy catalog extracted from the deep PS1 MDS stacked images, and adopt a maximum a posteriori formulation to model their difference-flux time-series in four Pan-STARRS1 photometric bands gP1, rP1, iP1, and zP1. We use three deterministic light-curve models to fit BL transients: a Gaussian, a Gamma distribution, and an analytic supernova (SN) model; and one stochastic light-curve model, the Ornstein-Uhlenbeck process, in order to fit variability that is characteristic of active galactic nuclei (AGNs). We assess the quality of fit of the models band-wise and source-wise, using their estimated leave-one-out cross-validation likelihoods and corrected Akaike information criteria. We then apply a K-means clustering algorithm on these statistics, to determine the source classification in each band. The final source classification is derived as a combination of the individual filter classifications, resulting in two measures of classification quality, from the averages across the photometric filters of (1) the classifications determined from the closest K-means cluster centers, and (2) the square distances from the clustering centers in the K-means clustering spaces. For a verification set of AGNs and SNe, we show that SV and BL occupy distinct regions in the plane constituted by these measures. We use our clustering method to characterize 4361 extragalactic image difference detected sources, in the first 2.5 yr of the PS1 MDS, into 1529 BL and 2262 SV, with a purity of 95.00% for AGNs and 90.97% for SNe based on our verification sets.
We combine our light-curve classifications with their nuclear or off-nuclear host galaxy offsets, to define a robust photometric sample of 1233 AGNs and 812 SNe. With these two samples, we characterize their variability and host galaxy properties, and identify simple photometric priors that would enable their real-time identification in future wide-field synoptic surveys.
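The K-means step that turns per-band fit statistics into classifications can be sketched with a minimal Lloyd's algorithm. Initial centers are passed in explicitly here for determinism; a production run would use a standard seeding scheme such as k-means++:

```python
def kmeans(points, centers, iters=50):
    """Minimal Lloyd's algorithm on low-dimensional statistics
    (e.g., per-source model-fit measures). `centers` supplies the
    initial cluster centers; returns final labels and centers."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iters):
        # assignment step: nearest center by squared Euclidean distance
        labels = [min(range(len(centers)),
                      key=lambda k: sum((p - c) ** 2
                                        for p, c in zip(pt, centers[k])))
                  for pt in points]
        # update step: each center moves to the mean of its members
        for k in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(x) / len(members) for x in zip(*members)]
    return labels, centers
```

In the paper's setting the input points would be the per-band cross-validation likelihoods and corrected AIC values, and the two clusters correspond to the SV-like and BL-like fit regimes.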
Abstract:
As the complexity of computing systems grows, reliability and energy are two crucial challenges calling for holistic solutions. In this paper, we investigate the interplay among concurrency, power dissipation, energy consumption and voltage-frequency scaling for a key numerical kernel for the solution of sparse linear systems. Concretely, we leverage a task-parallel implementation of the Conjugate Gradient method, equipped with a state-of-the-art preconditioner embedded in the ILUPACK software, and target a low-power multi-core processor from ARM. In addition, we perform a theoretical analysis of the impact of a technique like Near Threshold Voltage Computing (NTVC) from the points of view of increased hardware concurrency and error rate.
Novel Metabolite Biomarkers of Huntington's Disease As Detected by High-Resolution Mass Spectrometry
Abstract:
Huntington's disease (HD) is a fatal autosomal-dominant neurodegenerative disorder that affects approximately 3-10 people per 100 000 in the Western world. The median age of onset is 40 years, with death typically following 15-20 years later. In this study, we biochemically profiled post-mortem frontal lobe and striatum from HD sufferers (n = 14) and compared their profiles with controls (n = 14). LC-LTQ-Orbitrap-MS detected a total of 5579 and 5880 features for frontal lobe and striatum, respectively. An ROC curve combining two spectral features from frontal lobe had an AUC value of 0.916 (0.794 to 1.000) and following statistical cross-validation had an 83% predictive accuracy for HD. Similarly, two striatum biomarkers gave an ROC AUC of 0.935 (0.806 to 1.000) and after statistical cross-validation predicted HD with 91.8% accuracy. A range of metabolite disturbances were evident including but-2-enoic acid and uric acid, which were altered in both frontal lobe and striatum. A total of seven biochemical pathways (three in frontal lobe and four in striatum) were significantly altered as a result of HD. This study highlights the utility of high-resolution metabolomics for the study of HD. Further characterization of the brain metabolome could lead to the identification of new biomarkers and novel treatment strategies for HD.
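The ROC AUC values quoted above can be computed directly from biomarker scores via the Mann-Whitney formulation, as in this minimal sketch (the data in the test are illustrative, not the study's):

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen control (Mann-Whitney U
    statistic divided by n_pos * n_neg); tied scores count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.916, as reported for the two-feature frontal lobe panel, thus means a roughly 92% chance that a randomly chosen HD sample outranks a randomly chosen control on the combined score.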
Abstract:
Urothelial cancer (UC) is highly recurrent and can progress from non-invasive (NMIUC) to a more aggressive muscle-invasive (MIUC) subtype that invades the muscle tissue layer of the bladder. We present a proof-of-principle study showing that network-based features of gene pairs can be used to improve classifier performance and the functional analysis of urothelial cancer gene expression data. In the first step of our procedure, each individual sample of a UC gene expression dataset is inflated by gene pair expression ratios that are defined based on a given network structure. In the second step, an elastic net feature selection procedure for network-based signatures is applied to discriminate between NMIUC and MIUC samples. We performed a repeated random subsampling cross-validation in three independent datasets. The network signatures were characterized by a functional enrichment analysis and studied for the enrichment of known cancer genes. We observed that network-based gene signatures derived from meta-collections of protein-protein interaction (PPI) databases such as CPDB, and from the PPI databases HPRD and BioGrid, improved the classification performance compared to single-gene signatures. The network-based signatures derived from PPI databases showed a prominent enrichment of cancer genes (e.g., TP53, TRIM27 and HNRNPA2B1). We provide a novel integrative approach to large-scale gene expression analysis for the identification and development of novel diagnostic targets in bladder cancer. Further, our method allowed us to link cancer gene associations to network-based expression signatures that are not observed in gene-based expression signatures.
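The first step, inflating each sample with network-defined gene-pair features, can be sketched as follows. The log-ratio encoding and the gene names in the example are illustrative assumptions, not the paper's exact definition:

```python
import math

def network_pair_features(expr, edges, eps=1e-9):
    """Augment one sample's gene-expression profile with log-ratio
    features for gene pairs connected in a given network.
    `expr` maps gene name -> expression value; `edges` is a list of
    (geneA, geneB) pairs, e.g. taken from a PPI database. Pairs with
    a gene missing from the profile are skipped."""
    feats = dict(expr)  # keep the single-gene features
    for a, b in edges:
        if a in expr and b in expr:
            feats[f"{a}/{b}"] = math.log((expr[a] + eps) / (expr[b] + eps))
    return feats
```

Feeding such inflated profiles into an elastic net then lets the sparsity penalty pick out informative pairs alongside (or instead of) single genes, which is the mechanism the classification comparison above evaluates.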