901 results for QUANTILE REGRESSION


Relevance:

20.00%

Publisher:

Abstract:

Assessing the fit of a model is an important final step in any statistical analysis, but this is not straightforward when complex discrete response models are used. Cross validation and posterior predictions have been suggested as methods to aid model criticism. In this paper a comparison is made between four methods of model predictive assessment in the context of a three-level logistic regression model for clinical mastitis in dairy cattle: cross validation, a prediction using the full posterior predictive distribution, and two “mixed” predictive methods that incorporate higher-level random effects simulated from the underlying model distribution. Cross validation is considered a gold standard method but is computationally intensive, and thus a comparison is made between posterior predictive assessments and cross validation. The analyses revealed that mixed prediction methods produced results close to cross validation, whilst the full posterior predictive assessment gave predictions that were over-optimistic (closer to the observed disease rates) compared with cross validation. A mixed prediction method that simulated random effects from both higher levels was best at identifying the outlying level-two (farm-year) units of interest. It is concluded that this mixed prediction method, simulating random effects from both higher levels, is straightforward and may be of value in model criticism of multilevel logistic regression, a technique commonly used for animal health data with a hierarchical structure.

Relevance:

20.00%

Publisher:

Abstract:

The Bahadur representation and its applications have attracted a large number of publications and presentations on a wide variety of problems. Mixing dependency is weak enough to describe the dependent structure of random variables, including observations in time series and longitudinal studies. This note proves the Bahadur representation of sample quantiles for strongly mixing random variables (including ρ-mixing and φ-mixing) under very weak mixing coefficients. As an application, asymptotic normality is derived. These results greatly improve those recently reported in the literature.
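For orientation, the classical form of the Bahadur representation can be written as below. The notation is an assumption introduced here and does not come from the abstract: ξ_p is the p-th population quantile, \hat{\xi}_{p,n} the sample quantile, F_n the empirical distribution function, f the density at ξ_p, and R_n a remainder term. The rate of R_n depends on the dependence conditions and is not stated in the abstract, so it is left unspecified.

```latex
\hat{\xi}_{p,n} \;=\; \xi_p \;+\; \frac{p - F_n(\xi_p)}{f(\xi_p)} \;+\; R_n ,
\qquad
F_n(x) \;=\; \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{X_i \le x\}

% Asymptotic normality follows once R_n = o_p(n^{-1/2}):
\sqrt{n}\,\bigl(\hat{\xi}_{p,n} - \xi_p\bigr) \;\xrightarrow{d}\;
N\!\left(0,\ \frac{\sigma^2}{f(\xi_p)^2}\right)
```

In the i.i.d. case σ² = p(1 − p). For a stationary mixing sequence, σ² would instead be the long-run variance of the indicators 1{X_i ≤ ξ_p}, which includes their autocovariances.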

Relevance:

20.00%

Publisher:

Abstract:

For climate risk management, cumulative distribution functions (CDFs) are an important source of information. They are ideally suited to compare probabilistic forecasts of primary (e.g. rainfall) or secondary data (e.g. crop yields). Summarised as CDFs, such forecasts allow an easy quantitative assessment of possible, alternative actions. Although the degree of uncertainty associated with CDF estimation could influence decisions, such information is rarely provided. Hence, we propose Cox-type regression models (CRMs) as a statistical framework for making inferences on CDFs in climate science. CRMs were designed for modelling probability distributions rather than just mean or median values. This makes the approach appealing for risk assessments where probabilities of extremes are often more informative than central tendency measures. CRMs are semi-parametric approaches originally designed for modelling risks arising from time-to-event data. Here we extend this original concept beyond time-dependent measures to other variables of interest. We also provide tools for estimating CDFs and surrounding uncertainty envelopes from empirical data. These statistical techniques intrinsically account for non-stationarities in time series that might be the result of climate change. This feature makes CRMs attractive candidates to investigate the feasibility of developing rigorous global circulation model (GCM)-CRM interfaces for provision of user-relevant forecasts. To demonstrate the applicability of CRMs, we present two examples for El Niño/Southern Oscillation (ENSO)-based forecasts: the onset date of the wet season (Cairns, Australia) and total wet season rainfall (Quixeramobim, Brazil). This study emphasises the methodological aspects of CRMs rather than discussing merits or limitations of the ENSO-based predictors.

Relevance:

20.00%

Publisher:

Abstract:

In recent years, a plethora of approaches have been proposed to deal with the increasingly challenging task of multi-output regression. This paper provides a survey of state-of-the-art multi-output regression methods, which are categorized as problem transformation and algorithm adaptation methods. In addition, we present the most commonly used performance evaluation measures, publicly available data sets for real-world multi-output regression problems, and open-source software frameworks.

Relevance:

20.00%

Publisher:

Abstract:

The reliable and efficient design of steel-fibre-reinforced concrete (SFRC) structures requires clear knowledge of material properties. Since the locations and orientations of aggregates and fibres in concrete are intrinsically random, testing results from different specimens vary, and hundreds or even thousands of specimens and tests are needed to derive unbiased statistical distributions of material properties using traditional statistical techniques. Therefore, few statistical studies on SFRC material properties can be found in the literature. In this study, high-rate impact test results on SFRC obtained with a split Hopkinson pressure bar are further analysed. The influences of different strain rates and various volume fractions of fibres on the compressive strength of SFRC specimens under dynamic loadings will be quantified using kernel regression, a kernel-based nonparametric statistical method. Several kernel estimators and functions will be compared. This technique allows one to derive an unbiased statistical estimate from limited testing data, and is therefore especially useful when few test results are available.
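As a rough illustration of kernel regression, the sketch below implements a Nadaraya-Watson estimator with a Gaussian kernel. The abstract does not say which estimators or kernel functions the study compares, so this choice, the function names, and the bandwidth parameter are all assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel function (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x_train, y_train, x0, bandwidth):
    """Kernel-weighted average of y_train, with weights centred on x0.

    Hypothetical helper: each observation is weighted by how close its
    x value lies to x0, measured in units of the bandwidth.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x0 - xi) / bandwidth) for xi in x_train]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Toy, made-up numbers (not data from the study): two points placed
# symmetrically around x0 receive equal weight.
estimate = nadaraya_watson([-1.0, 1.0], [0.0, 2.0], x0=0.0, bandwidth=0.5)
```

A smaller bandwidth follows the data more closely and a larger one smooths more, so the bandwidth would normally be chosen by a data-driven rule such as cross validation.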

Relevance:

20.00%

Publisher:

Abstract:

Support for the saying “a picture is worth a thousand words” has been consistently found within statistics education. Graphical images are effective in promoting understanding and communication of statistical concepts and results to a variety of audiences. The computer software package AMOS was developed for the analysis of structural equation models (SEM) and has a user-friendly graphical interface. However, courses in SEM are generally found only at the postgraduate level. This paper argues that the graphical interface of AMOS has the potential to enhance conceptual understanding and communication of results in undergraduate statistics courses. More specifically, approaches to the teaching and communication of results of multiple regression models when using SPSS and AMOS will be examined and compared.

Relevance:

20.00%

Publisher:

Abstract:

Adaptability and invisibility are hallmarks of modern terrorism, and keeping pace with its dynamic nature presents a serious challenge for societies throughout the world. Innovations in computer science have incorporated applied mathematics to develop a wide array of predictive models to support the variety of approaches to counterterrorism. Predictive models are usually designed to forecast the location of attacks. Although this may protect individual structures or locations, it does not reduce the threat; it merely changes the target. While predictive models dedicated to events or social relationships receive much attention where the mathematical and social science communities intersect, models dedicated to terrorist locations such as safe-houses (rather than their targets or training sites) are rare and possibly nonexistent. At the time of this research, there were no publicly available models designed to predict locations where violent extremists are likely to reside. This research uses France as a case study to present a complex systems model that incorporates multiple quantitative, qualitative and geospatial variables that differ in terms of scale, weight, and type. Though many of these variables are recognized by specialists in security studies, there remains controversy with respect to their relative importance, degree of interaction, and interdependence. Additionally, some of the variables proposed in this research are not generally recognized as drivers, yet they warrant examination based on their potential role within a complex system. This research tested multiple regression models and determined that geographically weighted regression analysis produced the most accurate result to accommodate non-stationary coefficient behavior, demonstrating that geographic variables are critical to understanding and predicting the phenomenon of terrorism.
This dissertation presents a flexible prototypical model that can be refined and applied to other regions to inform stakeholders such as policy-makers and law enforcement in their efforts to improve national security and enhance quality-of-life.

Relevance:

20.00%

Publisher:

Abstract:

Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related to cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. Nevertheless, the probability of false negatives is also increased. Multivariate frameworks have been proposed to help alleviate this trade-off. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges better predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain, and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that a widespread but limited number of regions in the human brain supports high-level cognitive ability differences. Hum Brain Mapp, 2016. © 2016 Wiley Periodicals, Inc.

Relevance:

20.00%

Publisher:

Abstract:

A growing literature documents the existence of strategic political reactions to public expenditure between rival jurisdictions. These interactions can potentially create a downward expenditure spiral (“race to the bottom”) or a rising expenditure spiral (“race to the top”). However, in the course of identifying the existence of such interactions and ascertaining their underlying triggers, the empirical evidence has produced markedly heterogeneous findings. Most of this heterogeneity can be traced back to study design and institutional differences. This article contributes to the literature by applying meta-regression analysis to quantify the magnitude of strategic inter-jurisdictional expenditure interactions, controlling for study and institutional characteristics. We find several robust results beyond confirming that jurisdictions do engage in strategic expenditure interactions, namely that strategic interactions: (i) are weakening over time, (ii) are stronger among municipalities than among higher levels of government, and (iii) appear to be more influenced by tax competition than by yardstick competition, with capital controls and fiscal decentralization shaping the magnitude of fiscal interactions.

Relevance:

20.00%

Publisher:

Abstract:

This article examines the empirical support for the hypothesized hedonic theoretical relation between the price of wine and its quality. The examination considers over 180 hedonic wine price models developed over 20 years, covering many countries. The research identifies that the relation between the price of wine and its sensory quality rating is a moderate partial correlation of +0.30. This correlation exists despite the lack of information held by consumers about a wine’s quality and the inconsistency of expert tasters when evaluating wines. The results identify a moderate price-quality correlation, which suggests the existence of strategic buying opportunities for better-informed consumers. Strategic price-setting possibilities may also exist for wine producers given the incomplete quality information held by consumers. The results from the meta-regression analysis point to the absence of any publication bias, and attribute the observed asymmetry in estimates to study heterogeneity. The analysis suggests the observed heterogeneity is explained by the importance of a wine’s reputation, the use of the 100-point quality rating scale, the analysis of a single wine variety/style, and the employed functional form. The most important implication from the analysis is the relative importance of a wine’s reputation over its sensory quality, implying that producers need to sustain the sensory quality of a wine over time to extract appropriate returns. The reputation of the wine producer is found not to influence the strength of the price-quality relationship. This finding does not contradict the importance of wine producer reputation in directly influencing prices.

Relevance:

20.00%

Publisher:

Abstract:

Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of ‘mixed-effects’ or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, is as good as FE-MRA in all cases, and is better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the ‘true’ regression coefficient.
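A minimal sketch of what a weighted least squares meta-regression could look like is given below, assuming a single moderator, inverse-variance weights 1/SE², and a multiplicative dispersion factor estimated from the residuals rather than fixed at 1. The function name, the one-moderator restriction, and the toy numbers are hypothetical. The abstract does not give the estimator's formulas, so this is one reading of "unrestricted WLS" and not the authors' implementation.

```python
import math


def wls_meta_regression(effects, std_errors, moderator):
    """Fit effect_i = b0 + b1 * moderator_i by WLS with weights 1 / se_i**2.

    Returns (b0, b1, se_b0, se_b1, dispersion). The dispersion is the
    weighted residual mean square; the standard errors are scaled by it
    instead of constraining it to 1 (assumption: this is what
    "unrestricted" refers to).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("need at least three estimates")
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, moderator))
    swy = sum(wi * y for wi, y in zip(w, effects))
    swxx = sum(wi * x * x for wi, x in zip(w, moderator))
    swxy = sum(wi * x * y for wi, x, y in zip(w, moderator, effects))

    det = sw * swxx - swx ** 2  # determinant of X'WX
    if det == 0.0:
        raise ValueError("moderator has no variation")
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw

    # Weighted residual sum of squares and the estimated dispersion.
    rss = sum(wi * (y - b0 - b1 * x) ** 2
              for wi, x, y in zip(w, moderator, effects))
    dispersion = rss / (n - 2)

    se_b0 = math.sqrt(dispersion * swxx / det)
    se_b1 = math.sqrt(dispersion * sw / det)
    return b0, b1, se_b0, se_b1, dispersion


# Toy, made-up inputs that lie exactly on the line 0.2 + 0.1 * x.
toy_x = [0.0, 1.0, 2.0, 3.0]
toy_effects = [0.2 + 0.1 * x for x in toy_x]
toy_se = [0.1, 0.2, 0.1, 0.3]
b0, b1, se_b0, se_b1, dispersion = wls_meta_regression(toy_effects, toy_se, toy_x)
```

Under this reading the point estimates coincide with those of a fixed-effects fit using the same weights, and only the standard errors differ.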

Relevance:

20.00%

Publisher:

Abstract:

When the distribution of a process characterized by a profile is non-normal, process capability analysis under a normality assumption often leads to erroneous interpretations of process performance. Profile monitoring is a relatively new set of techniques in quality control that is used in situations where the state of a product or process is represented by a function of two or more quality characteristics. Such profiles can be modeled using linear or nonlinear regression models. In some applications, it is assumed that the quality characteristics follow a normal distribution; however, in certain applications this assumption may fail to hold and may yield misleading results. In this article, we consider process capability analysis of non-normal linear profiles. We investigate and compare five methods to estimate the non-normal process capability index (PCI) in profiles. In three of the methods, an estimate of the cumulative distribution function (cdf) of the process is required to analyze process capability in profiles. To estimate the cdf of the process, we use a Burr XII distribution as well as empirical distributions. However, the PCI obtained by estimating the cdf of the process is sometimes far from its true value. We therefore apply an artificial neural network with supervised learning, which allows the estimation of PCIs in profiles without the need to estimate the cdf of the process. A Box-Cox transformation technique is also developed to deal with non-normal situations. Finally, a comparison study is performed through the simulation of Gamma, Weibull, Lognormal, Beta and Student's t data.

Relevance:

20.00%

Publisher:

Abstract:

Malware replicates itself and produces offspring with the same characteristics but different signatures by using code obfuscation techniques. Current-generation anti-virus engines employ a signature-template detection approach, so malware can easily evade the existing signatures in the database. This reduces the capability of current anti-virus engines to detect malware. In this paper, we propose a stepwise binary logistic regression-based dimensionality reduction technique for malware detection using application program interface (API) call statistics. Finding the most significant malware features using traditional wrapper-based approaches has complexity exponential in the dimension (m) of the dataset with a brute-force search strategy, and complexity of order (m-1) with a backward elimination filter heuristic. The novelty of the proposed approach is that its worst-case computational complexity is less than order (m-1). The proposed approach uses multi-linear regression and the p-value of each individual API feature to select the most uncorrelated and significant features, in order to reduce the dimensionality of the large malware data and to ensure the absence of multi-collinearity. The stepwise logistic regression approach is then employed to test the significance of each individual malware feature based on its Wald statistic and to construct the binary decision model. When the selected most significant APIs are used in a decision rule generation system, this approach not only reduces the tree size but also improves classification performance. Exhaustive experiments on a large malware data set show that the proposed approach clearly outperforms the existing standard decision rule and support vector machine-based template approaches with complete data and provides a better statistical fit.

Relevance:

20.00%

Publisher:

Abstract:

This study focuses on multiple linear regression models relating six climate indices (temperature-humidity index THI, environmental stress index ESI, equivalent temperature index ETI, heat load index HLI, modified HLI (HLI new), and respiratory rate predictor RRP) to three main components of cow’s milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage and selection operator (LASSO) and the Akaike information criterion (AIC) are applied to select the best model for milk predictands with the smallest number of climate predictors. Uncertainty is estimated by bootstrapping through resampling. Cross validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010, are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors, with p-value < 0.001 and R² (0.50, 0.49) respectively. In summer, milk yield with the independent variables THI, ETI, and ESI shows the strongest relation (p-value < 0.001) with R² (0.69). For fat and protein the results are only marginal. This method is suggested for impact studies of climate variability/change in agriculture and food science when only short time series or data with large uncertainty are available.
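The bootstrap step mentioned above could be sketched as follows for the simplest case of one predictor. The pairs (case-resampling) bootstrap, the percentile interval, the function names, and the toy numbers are assumptions chosen for illustration, since the abstract does not specify the resampling scheme or the interval type.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("x has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_slope_interval(xs, ys, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval for the slope from a pairs bootstrap.

    Hypothetical helper: (x, y) pairs are resampled with replacement,
    the slope is refitted each time, and the empirical alpha/2 and
    1 - alpha/2 quantiles of the refitted slopes are returned.
    """
    if len(set(xs)) < 2:
        raise ValueError("need at least two distinct x values")
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        boot_x = [xs[i] for i in idx]
        if len(set(boot_x)) < 2:
            continue  # degenerate resample: slope undefined, draw again
        boot_y = [ys[i] for i in idx]
        slopes.append(ols_slope(boot_x, boot_y))
    slopes.sort()
    lower = slopes[int((alpha / 2) * (n_boot - 1))]
    upper = slopes[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Toy, made-up data lying exactly on y = 2x + 1 (not data from the study).
toy_x = [float(i) for i in range(10)]
toy_y = [2.0 * x + 1.0 for x in toy_x]
lower, upper = bootstrap_slope_interval(toy_x, toy_y, n_boot=200)
```

With short series the percentile interval can be rough, which is consistent with the abstract's emphasis on data with large uncertainty.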