43 results for statistical methods


Relevance: 30.00%

Publisher:

Abstract:

Motivation: The inference of regulatory networks from large-scale expression data holds great promise because of the potentially causal interpretation of these networks. However, because it is difficult to establish reliable methods based on observational data, knowledge about the possibilities and limitations of such inference methods in this context remains incomplete.

Results: In this article, we conduct a statistical analysis investigating differences and similarities among four network inference algorithms, ARACNE, CLR, MRNET and RN, with respect to local network-based measures. We employ ensemble methods that allow us to assess inferability down to the level of individual edges. Our analysis reveals the bias of these inference methods with respect to the inference of various network components and hence provides guidance for interpreting regulatory networks inferred from expression data. Further, as an application, we predict the total number of regulatory interactions in human B cells and hypothesize about the role of Myc and its targets in molecular information processing.
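The simplest of the four algorithms, RN (relevance networks), links gene pairs whose mutual information (MI) exceeds a threshold. A minimal sketch, with a hypothetical `mutual_info` helper based on a crude equal-width histogram estimator (real implementations use more careful MI estimators and significance-based thresholds):

```python
import numpy as np

def mutual_info(x, y, bins=4):
    """Histogram estimate of mutual information (in nats) between two profiles."""
    c_xy, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = c_xy / c_xy.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of x
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of y
    nz = p_xy > 0
    return float((p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])).sum())

def relevance_network(expr, threshold):
    """expr: (genes, samples) matrix; returns edges {(i, j)} with MI above threshold."""
    g = expr.shape[0]
    return {(i, j) for i in range(g) for j in range(i + 1, g)
            if mutual_info(expr[i], expr[j]) > threshold}
```

ARACNE, CLR and MRNET start from the same pairwise MI matrix but post-process it differently (e.g. pruning indirect edges), which is one source of the edge-level biases the article analyses.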


Wind power generation differs from conventional thermal generation due to the stochastic nature of wind. Wind power forecasting therefore plays a key role in dealing with the challenge of balancing supply and demand in any electricity system, given the uncertainty associated with wind farm power output. Accurate wind power forecasting reduces the need for additional balancing energy and reserve power to integrate wind power. Wind power forecasting tools enable better dispatch, scheduling and unit commitment of thermal generators, hydro plant and energy storage plant, and more competitive market trading as wind power ramps up and down on the grid. This paper presents an in-depth review of current methods and advances in wind power forecasting and prediction. First, numerical wind prediction methods from global to local scales, ensemble forecasting, and upscaling and downscaling processes are discussed. Next, statistical and machine learning approaches are detailed. Then the techniques used for benchmarking and uncertainty analysis of forecasts are reviewed, and the performance of various approaches over different forecast time horizons is examined. Finally, current research activities, challenges and potential future developments are appraised.
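The statistical approach can be illustrated with the standard baseline comparison used in forecast benchmarking: a persistence forecast (next value equals the current value) against a least-squares AR(1) model. The function names are illustrative, not from any package discussed in the review:

```python
import numpy as np

def persistence_forecast(series):
    """Naive benchmark: the next value equals the current value."""
    return series[:-1]

def ar1_forecast(series):
    """One-step forecast from a least-squares AR(1) fit: x_t ~ a*x_{t-1} + b."""
    x, y = series[:-1], series[1:]
    a, b = np.polyfit(x, y, 1)
    return a * x + b

def mae(pred, actual):
    """Mean absolute error, a common wind-forecast accuracy metric."""
    return float(np.mean(np.abs(pred - actual)))
```

Persistence is hard to beat at very short horizons, which is why new forecasting methods are conventionally reported as skill relative to it.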


Modern biology and medicine aim to uncover the molecular and cellular causes of biological functions and diseases. Gene regulatory networks (GRN) inferred from gene expression data are considered an important aid for this research because they provide a map of molecular interactions. Hence, GRNs have the potential to enable and enhance basic as well as applied research in the life sciences. In this paper, we introduce a new method called BC3NET for inferring causal gene regulatory networks from large-scale gene expression data. BC3NET is an ensemble method based on bagging the C3NET algorithm, which corresponds to a Bayesian approach with noninformative priors. In this study we demonstrate, for a variety of simulated and biological gene expression data sets from S. cerevisiae, that BC3NET is an important enhancement over other inference methods, capable of sensibly capturing biochemical interactions from transcription regulation and protein-protein interactions. An implementation of BC3NET is freely available as an R package from the CRAN repository.
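The bagging idea behind BC3NET can be sketched generically: infer a network on each bootstrap replicate of the samples and keep the edges that recur. The sketch below uses a simple correlation-threshold learner as a stand-in for C3NET (which is mutual-information based), so it illustrates only the ensemble logic, not the actual algorithm:

```python
import numpy as np

def base_infer(expr, threshold=0.5):
    """Stand-in base learner: edge when |correlation| exceeds a threshold.
    (BC3NET's actual base learner is the mutual-information-based C3NET.)"""
    corr = np.corrcoef(expr)
    g = corr.shape[0]
    return {(i, j) for i in range(g) for j in range(i + 1, g)
            if abs(corr[i, j]) > threshold}

def bagged_network(expr, n_boot=50, keep=0.5, seed=0):
    """Bagging: infer a network on each bootstrap of the samples (columns),
    then keep edges recovered in at least a `keep` fraction of the runs."""
    rng = np.random.default_rng(seed)
    n = expr.shape[1]
    counts = {}
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample samples with replacement
        for e in base_infer(expr[:, idx]):
            counts[e] = counts.get(e, 0) + 1
    return {e for e, c in counts.items() if c >= keep * n_boot}
```

Aggregating over bootstrap replicates is what stabilises the inferred edge set relative to a single run of the base learner.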


GC-MS data on veterinary drug residues in bovine urine are used for controlling the illegal practice of fattening cattle. According to current detection criteria, peak patterns of preferably four ions should agree within 10 or 20% with a corresponding standard pattern. These criteria are rigid, rather arbitrary and do not match daily practice. A new model, based on multivariate modeling of log peak abundance ratios, provides a theoretical basis for the identification of analytes and optimizes the balance between the avoidance of false positives and false negatives. The performance of the model is demonstrated on data provided by five laboratories, each supplying GC-MS measurements on the detection of clenbuterol, dienestrol and 19 beta-nortestosterone in urine. The proposed model shows better performance than confirmation using the current criteria and provides a statistical basis for inspection criteria in terms of error probabilities.
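The "current criteria" the paper argues against amount to a simple tolerance check on relative ion abundances. A minimal sketch; the function names and the 20% tolerance shown are illustrative:

```python
def ion_ratios(abundances):
    """Diagnostic-ion abundances expressed relative to the base (largest) peak."""
    base = max(abundances)
    return [a / base for a in abundances]

def confirms(sample, standard, tolerance=0.2):
    """Current-style criterion: every relative ion ratio of the sample must lie
    within +/-tolerance (relative) of the corresponding standard ratio."""
    return all(abs(s - r) / r <= tolerance
               for s, r in zip(ion_ratios(sample), ion_ratios(standard))
               if r > 0)
```

A hard pass/fail band like this ignores the measurement-error structure across ions, which is exactly what the proposed multivariate log-ratio model replaces with error probabilities.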


In this article, we focus on the analysis of competitive gene set methods for detecting the statistical significance of pathways from gene expression data. Our main result is to demonstrate that some of the most frequently used gene set methods, GSEA, GSEArot and GAGE, are severely influenced by the filtering of the data, to the point that such an analysis is no longer reconcilable with the principles of statistical inference, rendering the obtained results in the worst case meaningless. A possible consequence is that these methods can increase their power through the addition of unrelated data and noise. Our results are obtained within a bootstrapping framework that allows a rigorous assessment of the robustness of results and enables power estimates. Our results indicate that when using competitive gene set methods, it is imperative to apply a stringent gene filtering criterion. However, even when genes are filtered appropriately, for gene expression data from chips that do not provide genome-scale coverage of the expression values of all mRNAs, this is not enough for GSEA, GSEArot and GAGE to ensure the statistical soundness of the applied procedure. For this reason, for biomedical and clinical studies, we strongly advise against using GSEA, GSEArot and GAGE for such data sets.
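The dependence of competitive tests on the background gene pool can be illustrated with a toy permutation test: padding the background with unrelated zero-score genes drags the null distribution down and makes the same gene set look more significant. This is a sketch of the general phenomenon only, not of GSEA itself:

```python
import numpy as np

def competitive_pvalue(scores, set_idx, n_perm=2000, seed=0):
    """Toy competitive test: compare the mean score of a gene set against
    random same-size sets drawn from all measured genes."""
    rng = np.random.default_rng(seed)
    obs = scores[set_idx].mean()
    k = len(set_idx)
    null = np.array([scores[rng.choice(len(scores), k, replace=False)].mean()
                     for _ in range(n_perm)])
    # Add-one permutation p-value
    return float((1 + (null >= obs).sum()) / (1 + n_perm))
```

Because the null is built from whatever genes happen to be measured, the same gene set yields a different p-value under different filtering, which is the article's core objection.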


This paper introduces the paired comparison model as a suitable approach for the analysis of partially ranked data. For example, the Inglehart index, collected in international social surveys to examine shifts in post-materialistic values, generates such data on a set of attitude items. However, current analysis methods have failed to account for the complex shifts in individual item values, or to incorporate subject covariates. The paired comparison model is thus developed to allow for covariate subject effects at the individual level, and a reparameterization allows the inclusion of smooth non-linear effects of continuous covariates. The Inglehart index collected in the 1993 International Social Science Programme survey is analysed, and complex non-linear changes of item values with age, level of education and religion are identified. The model proposed provides a powerful tool for social scientists.
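A standard paired comparison model is the Bradley-Terry model, commonly fitted by an MM (minorization-maximization) iteration. A minimal sketch, without the subject-covariate and smooth non-linear effects the paper adds:

```python
import numpy as np

def bradley_terry(wins, n_iter=200):
    """wins[i, j] = number of times item i was preferred over item j.
    Returns worth parameters (summing to 1) via the classic MM iteration."""
    n = wins.shape[0]
    p = np.ones(n) / n
    for _ in range(n_iter):
        total_wins = wins.sum(axis=1)
        new = np.empty(n)
        for i in range(n):
            denom = sum((wins[i, j] + wins[j, i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new[i] = total_wins[i] / denom
        p = new / new.sum()
    return p
```

Partial rankings such as the Inglehart index (a "most important" and "second most important" choice among four items) can be decomposed into implied pairwise preferences and fitted with exactly this kind of model.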


Large samples of multiplex pedigrees will probably be needed to detect susceptibility loci for schizophrenia by linkage analysis. Standardized ascertainment of such pedigrees from culturally and ethnically homogeneous populations may improve the probability of detection and replication of linkage. The Irish Study of High-Density Schizophrenia Families (ISHDSF) was formed by standardized ascertainment of multiplex schizophrenia families in 39 psychiatric facilities covering over 90% of the population of Ireland and Northern Ireland. We describe here a phenotypic sample and a subset thereof, the linkage sample. Individuals were included in the phenotypic sample if adequate diagnostic information, based on personal interview and/or hospital record, was available. Only individuals with available DNA were included in the linkage sample. Inclusion of a pedigree in the phenotypic sample required at least two first-, second-, or third-degree relatives with non-affective psychosis (NAP), one of whom had schizophrenia (S) or poor-outcome schizo-affective disorder (PO-SAD). Entry into the linkage sample required DNA samples from at least two individuals with NAP, of whom at least one had S or PO-SAD. Affection was defined by narrow, intermediate, and broad criteria. The phenotypic sample contained 277 pedigrees and 1,770 individuals, and the linkage sample 265 pedigrees and 1,408 individuals. Using the intermediate definition of affection, the phenotypic sample contained 837 affected individuals and 526 affected sibling pairs; the corresponding figures for the linkage sample were 700 and 420. Individuals with schizophrenia from these multiplex pedigrees resembled epidemiologically sampled cases with respect to age at onset, gender distribution, and most clinical symptoms, although they were more thought-disordered and had a poorer outcome.
Power analyses based on the model of linkage heterogeneity indicated that the ISHDSF should be able to detect a major locus that influences susceptibility to schizophrenia in as few as 20% of families. Compared to first-degree relatives of epidemiologically sampled schizophrenic probands, first-degree relatives of schizophrenic members from the ISHDSF had a similar risk for schizotypal personality disorder, affective illness, alcoholism, and anxiety disorder. With sufficient resources, large-scale ascertainment of multiplex schizophrenia pedigrees is feasible, especially in countries with catchmented psychiatric care and stable populations. Although somewhat more severely ill, schizophrenic members of such pedigrees appear to clinically resemble typical schizophrenic patients. Our ascertainment process for multiplex schizophrenia families did not select for excess familial risk for affective illness or alcoholism. With its large sample ascertained in a standardized manner from a relatively homogeneous population, the ISHDSF provides considerable power to detect susceptibility loci for schizophrenia.


The adulteration of extra virgin olive oil with other vegetable oils is a recognized problem with economic and health consequences. Current official methods have proved insufficient to detect such adulterations. One of the most concerning, and hardest to detect, adulterations with other vegetable oils is the addition of hazelnut oil. The main objective of this work was to develop a novel dimensionality reduction technique able to model oil mixtures as part of an integrated pattern recognition solution. This solution attempts to identify hazelnut oil adulterants in extra virgin olive oil at low percentages based on spectroscopic chemical fingerprints. The proposed Continuous Locality Preserving Projections (CLPP) technique allows the admixtures produced in-house to be modelled as data series, reflecting their continuous nature, instead of as discrete points. This methodology has the potential to be extended to other mixtures and adulterations of food products. Maintaining the continuous structure of the data manifold allows better visualisation of the classification problem examined and facilitates more accurate use of the manifold for detecting the adulterants.


Statistical downscaling (SD) methods have become a popular, low-cost and accessible means of bridging the gap between the coarse spatial resolution at which climate models output climate scenarios and the finer spatial scale at which impact modellers require these scenarios, with various SD techniques used for a wide range of applications across the world. This paper compares the Generator for Point Climate Change (GPCC) model and the Statistical DownScaling Model (SDSM), two contrasting SD methods, in terms of their ability to generate precipitation series under non-stationary conditions across ten contrasting global climates. The mean, maximum and a selection of distribution statistics, as well as the cumulative frequencies of dry and wet spells for four different temporal resolutions, were compared between the models and the observed series for a validation period. Results indicate that both methods can generate daily precipitation series that generally closely mirror observed series for a wide range of non-stationary climates. However, GPCC tends to overestimate higher precipitation amounts, whilst SDSM tends to underestimate them. This implies that GPCC is more likely to overestimate the effects of precipitation on a given impact sector, whilst SDSM is likely to underestimate them. GPCC performs better than SDSM in reproducing wet and dry day frequency, which is a key advantage for many impact sectors. Overall, the mixed performance of the two methods illustrates the importance of users performing a thorough validation in order to determine the influence of simulated precipitation on their chosen impact sector.
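The wet/dry-day and spell statistics used in such a validation can be computed directly from a daily series. A minimal sketch; the 0.1 mm wet-day threshold below is a common convention, not necessarily the one used in the paper:

```python
import numpy as np

def wet_day_frequency(precip, wet_threshold=0.1):
    """Fraction of days with precipitation at or above the wet-day threshold (mm)."""
    return float((np.asarray(precip, dtype=float) >= wet_threshold).mean())

def spell_lengths(precip, wet=True, wet_threshold=0.1):
    """Lengths of consecutive wet (or dry) runs in a daily series."""
    flags = np.asarray(precip, dtype=float) >= wet_threshold
    if not wet:
        flags = ~flags
    lengths, run = [], 0
    for f in flags:
        if f:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths
```

Comparing these statistics between downscaled and observed series over a validation period is the kind of check the paper recommends before feeding simulated precipitation into an impact model.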


Identifying processes that shape species geographical ranges is a prerequisite for understanding environmental change. Currently, species distribution modelling methods do not offer credible statistical tests of the relative influence of climate factors and typically ignore other processes (e.g. biotic interactions and dispersal limitation). We use a hierarchical model fitted with Markov Chain Monte Carlo to combine ecologically plausible niche structures using regression splines to describe unimodal but potentially skewed response terms. We apply spatially explicit error terms that account for (and may help identify) missing variables. Using three example distributions of European bird species, we map model results to show sensitivity to change in each covariate. We show that the overall strength of climatic association differs between species and that each species has considerable spatial variation in both the strength of the climatic association and the sensitivity to climate change. Our methods are widely applicable to many species distribution modelling problems and enable accurate assessment of the statistical importance of biotic and abiotic influences on distributions.


BACKGROUND:
Statistical numeracy, which is necessary for making informed medical decisions, is reduced among older adults, who make more decisions about their medical care and treatment than at any other stage of life. Objective numeracy scales are a source of anxiety among patients, and this anxiety is heightened among older adults.
OBJECTIVE:
We investigate the subjective numeracy scale as an alternative tool for measuring statistical numeracy with older adult samples.
METHODS:
Numeracy was assessed using objective measures for 526 adults ranging in age from 18 to 93 years, and all participants provided subjective numeracy ratings.
RESULTS:
Subjective numeracy correlated highly with objective measurements among the oldest adults (70+ years; r = 0.51, 95% CI 0.32 to 0.66) and among younger age groups. Subjective numeracy explained 33.2% of age differences in objective numeracy.
CONCLUSION:
The subjective numeracy scale provides an effective tool for assessing statistical numeracy for broad age ranges and circumvents problems associated with objective numeracy measures.
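A correlation with a confidence interval of the kind reported above is conventionally obtained via the Fisher z-transform. A minimal sketch on synthetic data; the function is illustrative, not from the study:

```python
import math
import numpy as np

def pearson_with_ci(x, y):
    """Pearson r with a 95% confidence interval from the Fisher z-transform."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    r = float(np.corrcoef(x, y)[0, 1])
    z = math.atanh(r)                  # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    lo = math.tanh(z - 1.96 * se)
    hi = math.tanh(z + 1.96 * se)
    return r, (lo, hi)
```

The interval width shrinks with sample size through the 1/sqrt(n - 3) term, which is why age-group subsamples yield wider intervals than the pooled sample.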