978 results for genetics, statistical genetics, variable models


Relevance: 100.00%

Publisher:

Abstract:

This paper analyzes the measure of systemic importance ΔCoVaR proposed by Adrian and Brunnermeier (2009, 2010) within the context of a similar class of risk measures used in the risk management literature. In addition, we develop a series of testing procedures, based on ΔCoVaR, to identify and rank the systemically important institutions. We stress the importance of statistical testing in interpreting the measure of systemic importance. An empirical application illustrates the testing procedures, using equity data for three European banks.
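
For reference, a standard statement of the definition (our notation; the paper's exact notation may differ): CoVaR is the value at risk of the system conditional on an institution's state, and ΔCoVaR contrasts conditioning on distress with conditioning on the median state,

$$\Delta\mathrm{CoVaR}_q^{sys|i} = \mathrm{CoVaR}_q^{sys \mid X^i = \mathrm{VaR}_q^i} - \mathrm{CoVaR}_q^{sys \mid X^i = \mathrm{Median}^i},$$

where $\mathrm{CoVaR}_q^{sys|C(X^i)}$ solves $\Pr\left(X^{sys} \le \mathrm{CoVaR}_q^{sys|C(X^i)} \mid C(X^i)\right) = q$.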

Relevance: 100.00%

Publisher:

Abstract:

A significant challenge in the prediction of climate change impacts on ecosystems and biodiversity is quantifying the sources of uncertainty that emerge within and between different models. Statistical species niche models have grown in popularity, yet no single best technique has been identified, reflecting their differing performance in different situations. Our aim was to quantify uncertainties associated with the application of two complementary modelling techniques. Generalised linear mixed models (GLMM) and generalised additive mixed models (GAMM) were used to model the realised niche of ombrotrophic Sphagnum species in British peatlands. These models were then used to predict changes in Sphagnum cover between 2020 and 2050 based on projections of climate change and atmospheric deposition of nitrogen and sulphur. Over 90% of the variation in the GLMM predictions was due to niche model parameter uncertainty, dropping to 14% for the GAMM. After covarying out other factors, average variation in predicted values of Sphagnum cover across UK peatlands was the next largest source of variation (8% for the GLMM and 86% for the GAMM). The better performance of the GAMM needs to be weighed against its tendency to overfit the training data. While our niche models are only a first approximation, we used them to undertake a preliminary evaluation of the relative importance of climate change and of nitrogen and sulphur deposition, and of the geographic locations of the largest expected changes in Sphagnum cover. Predicted changes in cover were all small (generally <1% in an average 4 m2 unit area) but also highly uncertain. Peatlands expected to be most affected by climate change in combination with atmospheric pollution were Dartmoor, the Brecon Beacons and the western Lake District.
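
In generic form (our notation, not necessarily the paper's), the two model classes differ only in how covariates enter the linear predictor: the GLMM uses fixed linear terms, while the GAMM replaces them with smooth functions estimated from the data,

$$\text{GLMM: } g(\mathbb{E}[y_{ij}]) = \beta_0 + \sum_k \beta_k x_{ijk} + u_j, \qquad \text{GAMM: } g(\mathbb{E}[y_{ij}]) = \beta_0 + \sum_k f_k(x_{ijk}) + u_j,$$

with link function $g$, random effect $u_j \sim N(0, \sigma_u^2)$ for site $j$, and smooths $f_k$. The flexibility of the $f_k$ is what buys the GAMM its better fit, and also its greater risk of overfitting the training data.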

Relevance: 100.00%

Publisher:

Abstract:

Accurate decadal climate predictions could be used to inform adaptation actions to a changing climate. The skill of such predictions from initialised dynamical global climate models (GCMs) may be assessed by comparison with predictions from statistical models based solely on historical observations. This paper presents two benchmark statistical models for predicting both the radiatively forced trend and the internal variability of annual mean sea surface temperatures (SSTs) on a decadal timescale, based on the gridded observation data set HadISST. For both statistical models, the trend related to radiative forcing is modelled using a linear regression of the SST time series at each grid box on the time series of equivalent global mean atmospheric CO2 concentration. The residual internal variability is then modelled by (1) a first-order autoregressive model (AR1) and (2) a constructed analogue model (CA). From the verification of 46 retrospective forecasts with start years from 1960 to 2005, the correlation coefficient for anomaly forecasts using trend with AR1 is greater than 0.7 over parts of the extra-tropical North Atlantic, the Indian Ocean and the western Pacific. This is primarily related to the prediction of the forced trend. More importantly, both CA and AR1 give skillful predictions of the internal variability of SSTs in the subpolar gyre region over the far North Atlantic for lead times of 2 to 5 years, with correlation coefficients greater than 0.5. For the subpolar gyre and parts of the South Atlantic, CA is superior to AR1 for lead times of 6 to 9 years. These statistical forecasts are also compared with ensemble mean retrospective forecasts by DePreSys, an initialised GCM. DePreSys is found to outperform the statistical models over large parts of the North Atlantic for lead times of 2 to 5 years and 6 to 9 years; however, trend with AR1 is generally superior to DePreSys in the North Atlantic Current region, while trend with CA is superior to DePreSys in parts of the South Atlantic for lead times of 6 to 9 years. These findings encourage further development of benchmark statistical decadal prediction models, and of methods to combine different predictions.
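
A minimal sketch of this two-step "trend plus AR1" benchmark, with synthetic data standing in for HadISST and equivalent CO2 (all numbers and variable names are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1960, 2006)
co2 = 315.0 + 1.5 * (years - 1960)                   # stand-in for equivalent CO2
sst = 0.005 * co2 + rng.normal(0, 0.1, years.size)   # synthetic SST at one grid box

# 1) Forced trend: linear regression of SST on CO2 concentration.
slope, intercept = np.polyfit(co2, sst, 1)
residual = sst - (intercept + slope * co2)

# 2) Internal variability: AR1 model fitted to the residuals.
phi = np.corrcoef(residual[:-1], residual[1:])[0, 1]  # lag-1 autocorrelation

# Forecast: project the trend with future CO2 and damp the last residual by phi^k.
future_co2 = co2[-1] + 1.5 * np.arange(1, 10)
forecast = (intercept + slope * future_co2) + residual[-1] * phi ** np.arange(1, 10)
print(forecast.round(3))
```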

Relevance: 100.00%

Publisher:

Abstract:

Logistic models are studied as a tool to convert dynamical forecast information (deterministic and ensemble) into probability forecasts. A logistic model is obtained by setting the logarithmic odds ratio equal to a linear combination of the inputs. As with any statistical model, logistic models will suffer from overfitting if the number of inputs is comparable to the number of forecast instances. Computational approaches to avoid overfitting by regularization are discussed, and efficient techniques for model assessment and selection are presented. A logit version of the lasso (originally a linear regression technique) is discussed. In lasso models, less important inputs are identified and the corresponding coefficients are set to zero, providing an efficient and automatic model reduction procedure. For the same reason, lasso models are particularly appealing for diagnostic purposes.
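
A minimal sketch of a lasso-logit fit of this kind, using scikit-learn's L1-penalised logistic regression on synthetic data (all inputs and parameter values are illustrative, not the paper's):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 10))                  # 10 forecast inputs (e.g. ensemble stats)
true_beta = np.array([1.5, -1.0] + [0.0] * 8)   # only 2 inputs actually matter
p = 1.0 / (1.0 + np.exp(-(X @ true_beta)))      # logistic link
y = rng.binomial(1, p)                          # binary event occurrence

# The L1 penalty shrinks unimportant coefficients exactly to zero,
# giving the automatic model reduction described above.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
model.fit(X, y)
print("coefficients:", model.coef_.round(2))    # most entries should be 0
print("CV score:", cross_val_score(model, X, y).mean())
```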

Relevance: 100.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance: 100.00%

Publisher:

Abstract:

The usual practice in using a control chart to monitor a process is to take samples of size n from the process every h hours. This article considers the properties of the X̄ chart when the size of each sample depends on what is observed in the preceding sample. The idea is that the sample should be large if the sample point of the preceding sample is close to but not actually outside the control limits and small if the sample point is close to the target. The properties of the variable sample size (VSS) X̄ chart are obtained using Markov chains. The VSS X̄ chart is substantially quicker than the traditional X̄ chart in detecting moderate shifts in the process.
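
A minimal sketch of the Markov-chain calculation behind such properties; the warning/control limits, sample sizes and shift below are illustrative choices, not the article's. The same machinery, with the sampling interval added to the state, extends to the VSSI chart of the next abstract.

```python
import numpy as np
from scipy.stats import norm

w, k = 1.0, 3.0     # warning and control limits (sigma units)
n1, n2 = 3, 8       # small and large sample sizes
delta = 0.5         # shift in the process mean, in sigma units

def region_probs(n, delta):
    """P(central region), P(warning region) for Z = sqrt(n)(xbar - mu0)/sigma."""
    mu = delta * np.sqrt(n)              # mean of Z under the shifted process
    central = norm.cdf(w - mu) - norm.cdf(-w - mu)
    inside = norm.cdf(k - mu) - norm.cdf(-k - mu)
    return central, inside - central

# Transient states: 0 = next sample small (n1), 1 = next sample large (n2).
# "Central" sends the chain to state 0, "warning" to state 1; otherwise it signals.
Q = np.zeros((2, 2))
Q[0] = region_probs(n1, delta)
Q[1] = region_probs(n2, delta)

# Fundamental-matrix formula: expected number of samples until a signal.
anss = np.linalg.solve(np.eye(2) - Q, np.ones(2))
print("ANSS starting small / large:", anss.round(2))
```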

Relevance: 100.00%

Publisher:

Abstract:

Recent theoretical studies have shown that the X̄ chart with variable sampling intervals (VSI) and the X̄ chart with variable sample size (VSS) are quicker than the traditional X̄ chart in detecting shifts in the process. This article considers the X̄ chart with variable sample size and sampling intervals (VSSI). It is assumed that the amount of time the process remains in control has exponential distribution. The properties of the VSSI X̄ chart are obtained using Markov chains. The VSSI X̄ chart is even quicker than the VSI or VSS X̄ charts in detecting moderate shifts in the process.

Relevance: 100.00%

Publisher:

Abstract:

Recent studies indicate that polymorphic genetic markers are potentially helpful in resolving genealogical relationships among individuals in a natural population. Genetic data provide opportunities for paternity exclusion when genotypic incompatibilities are observed among individuals, and the present investigation examines the resolving power of genetic markers in unambiguous positive determination of paternity. Under the assumption that the mother for each offspring in a population is unambiguously known, an analytical expression for the fraction of males excluded from paternity is derived for the case where males and females may be derived from two different gene pools. This theoretical formulation can also be used to predict the fraction of births for each of which all but one male can be excluded from paternity. We show that even when the average probability of exclusion approaches unity, a substantial fraction of births yield equivocal mother-father-offspring determinations. The number of loci needed to increase the frequency of unambiguous determinations to a high level is beyond the scope of current electrophoretic studies in most species. Application of this theory to electrophoretic data on Chamaelirium luteum (L.) shows that in 2255 offspring derived from 273 males and 70 females, only 57 triplets could be unequivocally determined with eight polymorphic protein loci, even though the average combined exclusionary power of these loci was 73%. The distribution of potentially compatible male parents, based on multilocus genotypes, was reasonably well predicted from the allele frequency data available for these loci. We demonstrate that genetic paternity analysis in natural populations cannot be reliably based on exclusionary principles alone. In order to measure the reproductive contributions of individuals in natural populations, more elaborate likelihood principles must be deployed.
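
The combined exclusionary power quoted above follows the usual multiplicative rule for independent loci: if locus $l$ alone excludes a random non-father with probability $P_l$, then over $L$ loci

$$P_{\text{combined}} = 1 - \prod_{l=1}^{L}\left(1 - P_l\right).$$

As an illustrative check, eight loci each excluding roughly 15% of random males combine to $1 - 0.85^8 \approx 0.73$, in line with the 73% figure; the paper's analytical expressions for the two-gene-pool case are more involved.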

Relevance: 100.00%

Publisher:

Abstract:

In recent years, disaster preparedness through assessment of medical and special needs persons (MSNP) has taken center stage in the public eye, owing to frequent natural disasters such as hurricanes, storm surge or tsunami driven by climate change and increased human activity on our planet. Statistical methods for complex survey design and analysis have gained significance as a consequence. Many challenges remain, however, in inferring such assessments over the target population for policy-level advocacy and implementation.

Objective. This study discusses the use of statistical methods for disaster preparedness and medical needs assessment to facilitate local and state governments in policy-level decision making and logistic support, so as to avoid loss of life and property in future calamities.

Methods. In order to obtain precise and unbiased estimates of medical special needs persons (MSNP) and of disaster preparedness for evacuation in the Rio Grande Valley (RGV) of Texas, a stratified and cluster-randomized multi-stage sampling design was implemented. The US School of Public Health, Brownsville surveyed 3088 households in three counties, namely Cameron, Hidalgo, and Willacy. Multiple statistical methods were implemented and estimates were obtained taking into account the probability of selection and clustering effects. The statistical methods discussed for data analysis were multivariate linear regression (MLR), survey linear regression (Svy-Reg), generalized estimating equations (GEE) and multilevel mixed models (MLM), all with and without sampling weights.

Results. The estimated population of the RGV was 1,146,796: 51.5% female, 90% Hispanic, 73% married, 56% unemployed and 37% with their own personal transport. 40% of people attained education up to elementary school, another 42% reached high school and only 18% went to college. Median household income is less than $15,000/year. MSNP were estimated at 44,196 (3.98%) [95% CI: 39,029; 51,123]. All statistical models are in concordance, with MSNP estimates ranging from 44,000 to 48,000. The MSNP estimates by method are: MLR (47,707; 95% CI: 42,462; 52,999), MLR with weights (45,882; 95% CI: 39,792; 51,972), bootstrap regression (47,730; 95% CI: 41,629; 53,785), GEE (47,649; 95% CI: 41,629; 53,670), GEE with weights (45,076; 95% CI: 39,029; 51,123), Svy-Reg (44,196; 95% CI: 40,004; 48,390) and MLM (46,513; 95% CI: 39,869; 53,157).

Conclusion. The RGV is a flood zone, highly susceptible to hurricanes and other natural disasters. People in the region are mostly Hispanic and under-educated, with among the lowest income levels in the U.S. In case of a disaster, people at large are incapacitated, with only 37% having personal transport to take care of MSNP. Local and state government intervention in terms of planning, preparation and support for evacuation is necessary in any such disaster to avoid loss of precious human life.

Key words: complex surveys, statistical methods, multilevel models, cluster randomized, sampling weights, raking, survey regression, generalized estimating equations (GEE), random effects, intracluster correlation coefficient (ICC).
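
A minimal sketch of the design-weighted estimation at the heart of such an analysis, assuming each household carries a binary MSNP indicator and a sampling weight (inverse probability of selection); the data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3088                              # households surveyed
y = rng.binomial(1, 0.04, n)          # 1 = household reports an MSNP
w = rng.uniform(250, 500, n)          # illustrative sampling weights

total_hat = np.sum(w * y)             # Horvitz-Thompson-style total
prev_hat = np.sum(w * y) / np.sum(w)  # weighted prevalence
print(f"estimated MSNP total: {total_hat:,.0f}, prevalence: {prev_hat:.2%}")
# A full analysis would also reflect stratification and clustering in the
# variance estimates, as the survey-regression and GEE models above do.
```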

Relevance: 100.00%

Publisher:

Abstract:

Visualization has proven to be a powerful and widely-applicable tool for the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines, and to data in 36 dimensions derived from satellite images.
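
As a loose illustration of the two-level idea only: the paper's components are latent variable models fitted jointly by EM, whereas this sketch substitutes plain PCA and k-means as simplified stand-ins for the top-level view and per-cluster sub-views.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic 12-dimensional data with three loose groups.
X = np.vstack([rng.normal(m, 1.0, (200, 12)) for m in (0.0, 4.0, 8.0)])

top_view = PCA(n_components=2).fit_transform(X)   # top-level 2-D plot of all data

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
sub_views = {
    c: PCA(n_components=2).fit_transform(X[labels == c])
    for c in range(3)                             # one deeper-level plot per cluster
}
print(top_view.shape, {c: v.shape for c, v in sub_views.items()})
```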

Relevance: 100.00%

Publisher:

Abstract:

This paper introduces a new technique in the investigation of limited-dependent variable models. It illustrates that variable precision rough set theory (VPRS), allied with the use of a modern method of classification, or discretisation, of data, can outperform the more standard approaches employed in economics, such as a probit model. These approaches, and certain inductive decision tree methods, are compared (through a Monte Carlo simulation approach) in the analysis of the decisions reached by the UK Monopolies and Mergers Committee. We show that, particularly in small samples, the VPRS model can improve on more traditional models, both in-sample and particularly in out-of-sample prediction. A similar improvement in out-of-sample prediction over the decision tree methods is also shown.
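
A minimal sketch of the VPRS notion at the core of this approach, the β-lower approximation; the objects, attributes and β value are illustrative, not the paper's:

```python
from collections import defaultdict

# An equivalence class (objects identical on the condition attributes) goes
# into the beta-lower approximation of a decision class if at least a
# proportion beta of its members belong to that class.
objects = {
    "o1": ("big", "A"), "o2": ("big", "A"), "o3": ("big", "A"),
    "o4": ("small", "B"), "o5": ("small", "B"),
}
target = {"o1", "o2", "o4"}   # say, cases where the merger was blocked
beta = 0.6                    # classification precision threshold

# Group objects into equivalence classes by condition-attribute values.
classes = defaultdict(set)
for obj, attrs in objects.items():
    classes[attrs].add(obj)

lower = set()
for members in classes.values():
    if len(members & target) / len(members) >= beta:
        lower |= members      # class is "probably" in the target set

print(sorted(lower))          # -> ['o1', 'o2', 'o3']
```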

Relevance: 100.00%

Publisher:

Abstract:

The problem of social diffusion has animated sociological thinking on topics ranging from the spread of an idea, an innovation or a disease, to the foundations of collective behavior and political polarization. While network diffusion has been a productive metaphor, the reality of diffusion processes is often muddier. Ideas and innovations diffuse differently from diseases, but, with a few exceptions, the diffusion of ideas and innovations has been modeled under the same assumptions as the diffusion of disease. In this dissertation, I develop two new diffusion models for "socially meaningful" contagions that address two of the most significant problems with current diffusion models: (1) that contagions can only spread along observed ties, and (2) that contagions do not change as they spread between people. I augment insights from these statistical and simulation models with an analysis of an empirical case of diffusion - the use of enterprise collaboration software in a large technology company. I focus the empirical study on when people abandon innovations, a crucial and understudied aspect of the diffusion of innovations. Using timestamped posts, I analyze in close detail when people abandon the software.

To address the first problem, I suggest a latent space diffusion model. Rather than treating ties as stable conduits for information, the latent space diffusion model treats ties as random draws from an underlying social space, and simulates diffusion over the social space. To address the second problem, I suggest a diffusion model with schemas. Rather than treating information as though it spreads without changes, the schema diffusion model allows people to modify information they receive to fit an underlying mental model of the information before they pass it to others. Theoretically, the social space model integrates actor ties and attributes simultaneously in a single social plane, while incorporating schemas into diffusion processes gives an explicit form to the reciprocal influences that cognition and the social environment have on each other. Practically, the latent space diffusion model produces statistically consistent diffusion estimates where using the network alone does not, and the diffusion-with-schemas model shows that introducing some cognitive processing into diffusion processes changes the rate and the ultimate distribution of the spreading information. Combining the latent space models with a schema notion for actors thus improves our models of social diffusion both theoretically and practically.
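
A toy sketch of the latent space diffusion idea, under illustrative assumptions: actors receive positions in a latent social plane, each contact is a fresh random draw weighted by proximity rather than a fixed observed tie, and a contagion spreads over those draws.

```python
import numpy as np

rng = np.random.default_rng(7)
n_actors, n_steps = 200, 50
pos = rng.normal(size=(n_actors, 2))         # latent social positions
infected = np.zeros(n_actors, dtype=bool)
infected[0] = True                           # seed the contagion

for _ in range(n_steps):
    for i in np.flatnonzero(infected):
        # Probability of contacting j decays with latent distance from i.
        d = np.linalg.norm(pos - pos[i], axis=1)
        p = np.exp(-d)
        p[i] = 0.0
        j = rng.choice(n_actors, p=p / p.sum())  # tie = random draw from the space
        if rng.random() < 0.3:                   # illustrative transmission probability
            infected[j] = True

print(f"{infected.sum()} of {n_actors} actors reached")
```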

The empirical case study focuses on how the changing value of an innovation, introduced by the innovation's network externalities, influences when people abandon the innovation. In it, I find that people are least likely to abandon an innovation when other people in their neighborhood currently use the software as well. The effect is particularly pronounced for supervisors' current use and for the number of supervisory team members who currently use the software. This case study not only points to an important process in the diffusion of innovation, but also suggests a new approach -- computerized collaboration systems -- to collecting and analyzing data on organizational processes.

Relevance: 100.00%

Publisher:

Abstract:

The three studies in this thesis focus on happiness and age and seek to contribute to our understanding of how happiness changes over the lifetime. The first study contributes by offering an explanation for what was evolving into a 'stylised fact' in the economics literature, the U-shape of happiness in age. No U-shape is evident from a visual inspection of the age-happiness relationship in the German socio-economic panel data, and it seems counter-intuitive that we just have to wait until we get old to be happy. Eliminating the very young, the very old, and the first-timers from the analysis did not explain away regression results supporting the U-shape of happiness in age, but fixed-effects analysis did. Analysis revealed that reverse causality arising from time-invariant individual traits explained the U-shape of happiness in age in the German population, and the results were robust across six econometric methods. Robustness was added to the German fixed-effects finding by replicating it with the Australian and the British socio-economic panel data sets. During analysis of the German data an unexpected finding emerged: an exceedingly large negative linear effect of age on happiness in fixed-effects regressions. There is a large self-reported happiness decline among those who remain in the German panel. A similar decline over time was not evident in the Australian or the British data. After testing away age, time and cohort effects, a time-in-panel effect was found: Germans who remain in the panel for longer progressively report lower levels of happiness. Because time-in-panel effects have not been included in happiness regression specifications, our estimates may be biased; perhaps some happiness economics studies that used German panel data need revisiting.

The second study builds upon the fixed-effects finding of the first study and extends our view of lifetime happiness to a cohort little visited by economists: children. Initial analysis extends our view of lifetime happiness beyond adulthood and reveals a happiness decline in adolescent (15 to 23 year-old) Australians that is twice the size of the happiness decline we see in older Australians (75 to 86 year-olds), whom we expect to be unhappy due to declining income, failing health and the onset of death. To resolve a difference of opinion in the literature as to whether childhood happiness decreases, increases, or remains flat in age, survey instruments and an Internet-based survey were developed and used to collect data from four hundred 9 to 14 year-old Australian children. Applying the data to a Model of Childhood Happiness revealed that the natural environment life-satisfaction domain factor did not have a significant effect on childhood happiness. However, the children's school environment and interactions-with-friends life-satisfaction domain factors explained over half of a steep decline in childhood happiness that is three times larger than what we see in older Australians. Adding personality to the model revealed what we expect to see with adults, that extraverted children are happier, but unexpectedly, so are conscientious children.

With the steep decline in the happiness of young Australians revealed and explanations offered, the third study builds on the time-invariant individual trait finding from the first study by applying the Australian panel data to an Aggregate Model of Average Happiness over the lifetime. The model's independent variable is the stress that arises from the interaction between personality and the life event shocks that affect individuals and peers throughout their lives. Interestingly, a graphic depiction of the stress-in-age relationship reveals an inverse U-shape, the opposite of the U-shape of happiness in age we saw in the first study. The stress arising from life event shocks is found to explain much of the change in average happiness over a lifetime. With the policy recommendations of economists potentially invoking unexpected changes in our lives, the ensuing stress and resulting (un)happiness warrant consideration before such recommendations are made.

Relevance: 100.00%

Publisher:

Abstract:

In this paper, we address the puzzle of the relationship between age and happiness. Whilst the majority of psychologists have concluded there is not much of a relationship at all, the economic literature has unearthed a possible U-shape relationship, with the minimum level of satisfaction occurring in middle age (35–50). In this paper, we look for a U-shape in three panel data sets: the German Socioeconomic Panel (GSOEP), the British Household Panel Survey (BHPS) and the Household, Income and Labour Dynamics in Australia (HILDA) survey. We find that the raw data mainly support a wave-like shape that only weakly looks U-shaped for the 20–60 age range. That weak U-shape in middle age becomes more pronounced when allowing for socio-economic variables. When we then take account of selection effects via fixed effects, however, the dominant age effect in all three panels is a strong happiness increase around the age of 60 followed by a major decline after 75, with the U-shape in middle age disappearing such that there is almost no change in happiness between the ages of 20 and 50.
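
In generic form, the fixed-effects specification behind such results absorbs all time-invariant individual traits into $\alpha_i$ (our notation, not the paper's):

$$H_{it} = \alpha_i + \sum_{a} \beta_a \, \mathbf{1}[\mathrm{age}_{it} = a] + \gamma' x_{it} + \varepsilon_{it},$$

where $H_{it}$ is reported happiness, the $\beta_a$ trace out the age profile from within-person variation only, and $x_{it}$ are socio-economic controls. Because selection on fixed traits is swept into $\alpha_i$, an apparent U-shape driven by who enters or stays in the panel can disappear under this specification.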

Relevance: 100.00%

Publisher:

Abstract:

This study constructs performance prediction models to estimate end-user perceived video quality on mobile devices for the latest video encoding techniques, VP9 and H.265. Both subjective and objective video quality assessments were carried out to collect data and select the most desirable predictors. Using statistical regression, two models were generated, achieving prediction accuracies of 94.5% and 91.5% respectively, depending on whether the predictor derived from the objective assessment is included. These proposed models can be directly used by media industries for video quality estimation, and will ultimately help them ensure a positive end-user quality of experience on future mobile devices after the adoption of the latest video encoding technologies.
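
A hedged sketch of this kind of performance-prediction regression: subjective quality scores are regressed on candidate predictors (here a bitrate term, resolution, and an objective-metric score); all variable names and data are synthetic stand-ins, not the study's:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 120
bitrate = rng.uniform(0.5, 8.0, n)                           # Mbps
resolution = rng.choice([480, 720, 1080], n)                 # vertical pixels
objective = 0.5 * np.log(bitrate) + rng.normal(0, 0.05, n)   # stand-in objective score
mos = (2.0 + 0.3 * np.log(bitrate) + 0.001 * resolution
       + 1.5 * objective + rng.normal(0, 0.2, n))            # synthetic subjective scores

X = np.column_stack([np.log(bitrate), resolution, objective])
model = LinearRegression().fit(X, mos)
pred = model.predict(X)
accuracy = 1 - np.mean(np.abs(pred - mos) / mos)             # one simple accuracy notion
print(f"in-sample prediction accuracy: {accuracy:.1%}")
```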