950 results for Akaike information criterion
Abstract:
We carried out a discriminant analysis with identity by descent (IBD) at each marker as inputs, and the sib pair type (affected-affected versus affected-unaffected) as the output. Using simple logistic regression for this discriminant analysis, we illustrate the importance of comparing models with different numbers of parameters. Such model comparisons are best carried out using either the Akaike information criterion (AIC) or the Bayesian information criterion (BIC). When AIC (or BIC) stepwise variable selection was applied to the German Asthma data set, a group of markers was selected that provides the best fit to the data (assuming an additive effect). Interestingly, these 25-26 markers were not identical to those with the highest (in magnitude) single-locus lod scores.
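To make the comparison of models with different numbers of parameters concrete, the following minimal sketch computes AIC and BIC from a maximized log-likelihood using the standard definitions. The candidate models, log-likelihood values, and sample size are hypothetical illustrations, not figures from the study.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical logistic-regression fits: (maximized log-likelihood, parameter count).
# The parameter count includes the intercept.
candidates = {
    "5 markers": (-310.0, 6),
    "25 markers": (-280.0, 26),
}
n_obs = 500  # hypothetical number of sib pairs

for name, (ll, k) in candidates.items():
    print(f"{name}: AIC={aic(ll, k):.1f}  BIC={bic(ll, k, n_obs):.1f}")
```

Lower values are preferred under both criteria. BIC penalizes each extra parameter by ln(n) rather than 2, so with made-up numbers like these the two criteria can rank the same candidates differently.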
Abstract:
This study focuses on multiple linear regression models relating six climate indices (temperature-humidity index THI, environmental stress index ESI, equivalent temperature index ETI, heat load index HLI, modified HLI (HLI new), and respiratory rate predictor RRP) with three main components of cow’s milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage selection operator (LASSO) and the Akaike information criterion (AIC) techniques are applied to select the best model for milk predictands with the smallest number of climate predictors. Uncertainty estimation is employed by applying bootstrapping through resampling. Cross validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010 are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors with p-value < 0.001 and R2 (0.50, 0.49) respectively. In summer, milk yield with independent variables of THI, ETI, and ESI shows the strongest relation (p-value < 0.001) with R2 (0.69). For fat and protein the results are only marginal. This method is suggested for impact studies of climate variability/change in the agriculture and food science fields when only short time series or data with large uncertainty are available.
Abstract:
In linear mixed models, model selection frequently includes the selection of random effects. Two versions of the Akaike information criterion (AIC) have been used, based either on the marginal or on the conditional distribution. We show that the marginal AIC is no longer an asymptotically unbiased estimator of the Akaike information, and in fact favours smaller models without random effects. For the conditional AIC, we show that ignoring estimation uncertainty in the random effects covariance matrix, as is common practice, induces a bias that leads to the selection of any random effect not predicted to be exactly zero. We derive an analytic representation of a corrected version of the conditional AIC, which avoids the high computational cost and imprecision of available numerical approximations. An implementation in an R package is provided. All theoretical results are illustrated in simulation studies, and their impact in practice is investigated in an analysis of childhood malnutrition in Zambia.
Abstract:
1. Ecological data sets often use clustered measurements or use repeated sampling in a longitudinal design. Choosing the correct covariance structure is an important step in the analysis of such data, as the covariance describes the degree of similarity among the repeated observations. 2. Three methods for choosing the covariance are: the Akaike information criterion (AIC), the quasi-information criterion (QIC), and the deviance information criterion (DIC). We compared the methods using a simulation study and using a data set that explored effects of forest fragmentation on avian species richness over 15 years. 3. The overall success was 80.6% for the AIC, 29.4% for the QIC and 81.6% for the DIC. For the forest fragmentation study the AIC and DIC selected the unstructured covariance, whereas the QIC selected the simpler autoregressive covariance. Graphical diagnostics suggested that the unstructured covariance was probably correct. 4. We recommend using DIC for selecting the correct covariance structure.
Abstract:
Maternal and infant mortality is a global health issue with a significant social and economic impact. Each year, over half a million women worldwide die due to complications related to pregnancy or childbirth, four million infants die in the first 28 days of life, and eight million infants die in the first year. Ninety-nine percent of maternal and infant deaths are in developing countries. Reducing maternal and infant mortality is among the key international development goals. In China, the national maternal mortality ratio and infant mortality rate were reduced greatly in the past two decades, yet a large discrepancy remains between urban and rural areas. To address this problem, a large-scale Safe Motherhood Programme was initiated in 2000. The programme was implemented in Guangxi in 2003. Interventions in the programme included both demand-side and supply-side interventions focusing on increasing health service use and improving birth outcomes. Little is known about the effects and economic outcomes of the Safe Motherhood Programme in Guangxi, although it has been implemented for seven years. The aim of this research is to estimate the effectiveness and cost-effectiveness of the interventions in the Safe Motherhood Programme in Guangxi, China. The objectives of this research include: 1. To evaluate whether the changes of health service use and birth outcomes are associated with the interventions in the Safe Motherhood Programme. 2. To estimate the cost-effectiveness of the interventions in the Safe Motherhood Programme and quantify the uncertainty surrounding the decision. 3. To assess the expected value of perfect information associated with both the whole decision and individual parameters, and interpret the findings to inform priority setting in further research and policy making in this area. A quasi-experimental study design was used in this research to assess the effectiveness of the programme in increasing health service use and improving birth outcomes.
The study subjects were 51 intervention counties and 30 control counties. Data on the health service use, birth outcomes and socio-economic factors from 2001 to 2007 were collected from the programme database and statistical yearbooks. Based on the profile plots of the data, general linear mixed models were used to evaluate the effectiveness of the programme while controlling for the effects of baseline levels of the response variables, change of socio-economic factors over time and correlations among repeated measurements from the same county. Redundant multicollinear variables were deleted from the mixed model using the results of the multicollinearity diagnoses. For each response variable, the best covariance structure was selected from 15 alternatives according to the fit statistics including Akaike information criterion, Finite-population corrected Akaike information criterion, and Schwarz's Bayesian information criterion. Residual diagnostics were used to validate the model assumptions. Statistical inferences were made to show the effect of the programme on health service use and birth outcomes. A decision analytic model was developed to evaluate the cost-effectiveness of the programme, quantify the decision uncertainty, and estimate the expected value of perfect information associated with the decision. The model was used to describe the transitions between health states for women and infants and reflect the change of both costs and health benefits associated with implementing the programme. Results gained from the mixed models and other relevant evidence identified were synthesised appropriately to inform the input parameters of the model. Incremental cost-effectiveness ratios of the programme were calculated for the two groups of intervention counties over time. Uncertainty surrounding the parameters was dealt with using probabilistic sensitivity analysis, and uncertainty relating to model assumptions was handled using scenario analysis.
Finally, the expected value of perfect information for both the whole model and individual parameters in the model was estimated to inform priority setting in further research in this area. The annual change rates of the antenatal care rate and the institutionalised delivery rate were improved significantly in the intervention counties after the programme was implemented. Significant improvements were also found in the annual change rates of the maternal mortality ratio, the infant mortality rate, the incidence rate of neonatal tetanus and the mortality rate of neonatal tetanus in the intervention counties after the implementation of the programme. The annual change rate of the neonatal mortality rate was also improved, although the improvement was only close to statistical significance. The influences of the socio-economic factors on the health service use indicators and birth outcomes were identified. The rural income per capita had a significant positive impact on the health service use indicators, and a significant negative impact on the birth outcomes. The number of beds in healthcare institutions per 1,000 population and the number of rural telephone subscribers per 1,000 were found to be significantly positively related to the institutionalised delivery rate. The length of highway per square kilometre negatively influenced the maternal mortality ratio. The percentage of employed persons in the primary industry had a significant negative impact on the institutionalised delivery rate, and a significant positive impact on the infant mortality rate and neonatal mortality rate. The incremental costs of implementing the programme over the existing practice were US $11.1 million from the societal perspective, and US $13.8 million from the perspective of the Ministry of Health.
Overall, 28,711 life years were generated by the programme, producing an overall incremental cost-effectiveness ratio of US $386 from the societal perspective, and US $480 from the perspective of the Ministry of Health, both of which were below the threshold willingness-to-pay ratio of US $675. The expected net monetary benefit generated by the programme was US $8.3 million from the societal perspective, and US $5.5 million from the perspective of the Ministry of Health. The overall probability that the programme was cost-effective was 0.93 and 0.89 from the two perspectives, respectively. The incremental cost-effectiveness ratio of the programme was insensitive to the different estimates of the three parameters relating to the model assumptions. Further research could be conducted to reduce the uncertainty surrounding the decision, in which the upper limit of investment was US $0.6 million from the societal perspective, and US $1.3 million from the perspective of the Ministry of Health. It is also worthwhile to get a more precise estimate of the improvement of infant mortality rate. The population expected value of perfect information for individual parameters associated with this parameter was US $0.99 million from the societal perspective, and US $1.14 million from the perspective of the Ministry of Health. The findings from this study have shown that the interventions in the Safe Motherhood Programme were both effective and cost-effective in increasing health service use and improving birth outcomes in rural areas of Guangxi, China. Therefore, the programme represents a good public health investment and should be adopted and further expanded to an even broader area if possible. This research provides economic evidence to inform efficient decision making in improving maternal and infant health in developing countries.
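The cost-effectiveness figures in this abstract are linked by two standard formulas: the incremental cost-effectiveness ratio (incremental cost divided by incremental effect) and the net monetary benefit (willingness-to-pay threshold times incremental effect, minus incremental cost). The sketch below applies them to the rounded values quoted in the abstract. Because the inputs are rounded and the thesis's exact calculation (for example any discounting) is not given here, the outputs are only expected to be close to the reported figures, not identical.

```python
def icer(incremental_cost: float, incremental_effect: float) -> float:
    """Incremental cost-effectiveness ratio: cost per unit of health gained."""
    return incremental_cost / incremental_effect


def net_monetary_benefit(wtp: float, incremental_effect: float,
                         incremental_cost: float) -> float:
    """Net monetary benefit at a given willingness-to-pay threshold."""
    return wtp * incremental_effect - incremental_cost


# Rounded values quoted in the abstract (US dollars, life years).
life_years = 28_711
wtp_threshold = 675.0
incremental_costs = {
    "societal": 11.1e6,
    "Ministry of Health": 13.8e6,
}

for perspective, cost in incremental_costs.items():
    ratio = icer(cost, life_years)
    nmb = net_monetary_benefit(wtp_threshold, life_years, cost)
    verdict = "below" if ratio < wtp_threshold else "above"
    print(f"{perspective}: ICER ~ ${ratio:,.0f} per life year ({verdict} threshold), "
          f"NMB ~ ${nmb / 1e6:.1f} million")
```

A positive net monetary benefit and an ICER below the threshold are two ways of expressing the same decision rule.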
Abstract:
Background. Interventions that prevent healthcare-associated infection should lead to fewer deaths and shorter hospital stays. Cleaning hands (with soap or alcohol) is an effective way to prevent the transmission of organisms, but rates of compliance with hand hygiene are sometimes disappointingly low. The National Hand Hygiene Initiative in Australia aimed to improve hand hygiene compliance among healthcare workers, with the goal of reducing rates of healthcare-associated infection. Methods. We examined whether the introduction of the National Hand Hygiene Initiative was associated with a change in infection rates. Monthly infection rates for healthcare-associated Staphylococcus aureus bloodstream infections were examined in 38 Australian hospitals across 6 states. We used Poisson regression and examined 12 possible patterns of change, with the best fitting pattern chosen using the Akaike information criterion. Monthly bed-days were included to control for increased hospital use over time. Results. The National Hand Hygiene Initiative was associated with a reduction in infection rates in 4 of the 6 states studied. Two states showed an immediate reduction in rates of 17% and 28%, 2 states showed a linear decrease in rates of 8% and 11% per year, and 2 showed no change in infection rates. Conclusions. The intervention was associated with reduced infection rates in most states. The failure in 2 states may have been because those states already had effective initiatives before the national initiative’s introduction or because infection rates were already low and could not be further reduced.
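As an illustration of choosing between patterns of change with the AIC, the sketch below compares two of the simplest possible patterns for monthly counts with bed-days as exposure: a constant infection rate and an immediate step change at the intervention. It is a simplified stand-in, not the study's analysis: the study examined 12 patterns with Poisson regression, whereas here the two rates have closed-form maximum likelihood estimates (total infections divided by total bed-days). All data and the names `counts`, `bed_days`, and `change_index` are hypothetical.

```python
import math


def poisson_loglik(counts, means):
    """Poisson log-likelihood of observed counts given expected counts."""
    return sum(y * math.log(m) - m - math.lgamma(y + 1)
               for y, m in zip(counts, means))


def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood


def fit_constant(counts, bed_days):
    """One rate for the whole series (1 parameter)."""
    rate = sum(counts) / sum(bed_days)
    means = [rate * e for e in bed_days]
    return poisson_loglik(counts, means), 1


def fit_step(counts, bed_days, change_index):
    """Separate rates before and after the intervention (2 parameters)."""
    pre = sum(counts[:change_index]) / sum(bed_days[:change_index])
    post = sum(counts[change_index:]) / sum(bed_days[change_index:])
    means = [(pre if i < change_index else post) * e
             for i, e in enumerate(bed_days)]
    return poisson_loglik(counts, means), 2


# Hypothetical monthly infection counts; intervention starts at month index 6.
counts = [12, 10, 11, 13, 12, 11, 6, 5, 7, 6, 5, 6]
bed_days = [10_000] * 12
change_index = 6

aic_constant = aic(*fit_constant(counts, bed_days))
aic_step = aic(*fit_step(counts, bed_days, change_index))
best = "step change" if aic_step < aic_constant else "constant rate"
print(f"AIC constant={aic_constant:.2f}, AIC step={aic_step:.2f}, preferred: {best}")
```

Including bed-days as exposure means the comparison is between rates rather than raw counts, which is the role monthly bed-days play in the abstract.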
Abstract:
Research problem: Overfitting and collinearity problems commonly exist in current construction cost estimation applications and obstruct researchers and practitioners in achieving better modelling results. Research objective and method: A hybrid approach of Akaike information criterion (AIC) stepwise regression and principal component regression (PCR) is proposed to help solve overfitting and collinearity problems. Utilization of this approach in linear regression is validated by comparing it with other commonly used approaches. The mean square error obtained by leave-one-out cross validation (MSELOOCV) is used in model selection in deciding predictive variables.
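The leave-one-out cross-validated mean square error used for model selection can be sketched as follows. For brevity this example compares only an intercept-only model with a single-predictor least-squares line; it does not implement the AIC stepwise or principal component regression steps of the proposed hybrid approach. The data (floor area versus cost) and all function names are hypothetical.

```python
def fit_mean(xs, ys):
    """Intercept-only model: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least squares with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def loocv_mse(fit, xs, ys):
    """Refit with each observation held out and average the squared errors."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predict = fit(train_x, train_y)
        squared_errors.append((ys[i] - predict(xs[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical projects: floor area (m^2) and cost (arbitrary units).
area = [50.0, 80.0, 100.0, 120.0, 150.0, 200.0]
cost = [110.0, 160.0, 215.0, 250.0, 300.0, 410.0]

scores = {
    "intercept only": loocv_mse(fit_mean, area, cost),
    "area": loocv_mse(fit_line, area, cost),
}
best = min(scores, key=scores.get)
print(scores, "-> selected:", best)
```

Because each error is measured on an observation the model did not see during fitting, this score penalizes overfitted models in a way that the in-sample error does not.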
Abstract:
Aim Large-scale patterns linking energy availability, biological productivity and diversity form a central focus of ecology. Despite evidence that the activity and abundance of animals may be limited by climatic variables associated with regional biological productivity (e.g. mean annual precipitation and annual actual evapotranspiration), it is unclear whether plant–granivore interactions are themselves influenced by these climatic factors across broad spatial extents. We evaluated whether climatic conditions that are known to alter the abundance and activity of granivorous animals also affect rates of seed removal. Location Eleven sites across temperate North America. Methods We used a common protocol to assess the removal of the same seed species (Avena sativa) over a 2-day period. Model selection via the Akaike information criterion was used to determine a set of candidate binomial generalized linear mixed models that evaluated the relationship between local climatic data and post-dispersal seed predation. Results Annual actual evapotranspiration was the single best predictor of the proportion of seeds removed. Annual actual evapotranspiration and mean annual precipitation were both positively related to mean seed removal and were included in four and three of the top five models, respectively. Annual temperature range was also positively related to seed removal and was an explanatory variable in three of the top four models. Main conclusions Our work provides the first evidence that energy and precipitation, which are known to affect consumer abundance and activity, also translate to strong, predictable patterns of seed predation across a continent. More generally, these findings suggest that future changes in temperature and precipitation could have widespread consequences for plant species composition in grasslands, through impacts on plant recruitment.
Abstract:
Spatial data analysis has become increasingly important in ecology and economics during the last decade. One focus of spatial data analysis is how to select predictors, variance functions and correlation functions. However, in general, the true covariance function is unknown and the working covariance structure is often misspecified. In this paper, our target is to find a good strategy to identify the best model from the candidate set using model selection criteria. We evaluate the ability of some information criteria (corrected Akaike information criterion, Bayesian information criterion (BIC) and residual information criterion (RIC)) to choose the optimal model when the working correlation function, the working variance function and the working mean function are correct or misspecified. Simulations are carried out for small to moderate sample sizes. Four candidate covariance functions (exponential, Gaussian, Matern and rational quadratic) are used in simulation studies. From the simulation results, we find that the misspecified working correlation structure can still capture some spatial correlation information in model fitting. When the sample size is large enough, BIC and RIC perform well even if the working covariance is misspecified. Moreover, the performance of these information criteria is related to the average level of model fitting, which can be indicated by the average adjusted R-square, and overall RIC performs well.
Abstract:
The stress release model, a stochastic version of the elastic rebound theory, is applied to the large events from four synthetic earthquake catalogs generated by models with various levels of disorder in the distribution of fault zone strength (Ben-Zion, 1996). They include models with uniform properties (U), a Parkfield-type asperity (A), fractal brittle properties (F), and multi-size-scale heterogeneities (M). The results show that the degree of regularity or predictability in the assumed fault properties, based on both the Akaike information criterion and simulations, follows the order U, F, A, and M, which is in good agreement with that obtained by pattern recognition techniques applied to the full set of synthetic data. Data simulated from the best fitting stress release models reproduce, both visually and in distributional terms, the main features of the original catalogs. The differences in character and the quality of prediction between the four cases are shown to be dependent on two main aspects: the parameter controlling the sensitivity to departures from the mean stress level and the frequency-magnitude distribution, which differs substantially between the four cases. In particular, it is shown that the predictability of the data is strongly affected by the form of the frequency-magnitude distribution, being greatly reduced if a pure Gutenberg-Richter form is assumed to hold out to high magnitudes.