938 resultados para Page Rank
Resumo:
BACKGROUND Measuring disease and injury burden in populations requires a composite metric that captures both premature mortality and the prevalence and severity of ill-health. The 1990 Global Burden of Disease study proposed disability-adjusted life years (DALYs) to measure disease burden. No comprehensive update of disease burden worldwide incorporating a systematic reassessment of disease and injury-specific epidemiology has been done since the 1990 study. We aimed to calculate disease burden worldwide and for 21 regions for 1990, 2005, and 2010 with methods to enable meaningful comparisons over time. METHODS We calculated DALYs as the sum of years of life lost (YLLs) and years lived with disability (YLDs). DALYs were calculated for 291 causes, 20 age groups, both sexes, and for 187 countries, and aggregated to regional and global estimates of disease burden for three points in time with strictly comparable definitions and methods. YLLs were calculated from age-sex-country-time-specific estimates of mortality by cause, with death by standardised lost life expectancy at each age. YLDs were calculated as prevalence of 1160 disabling sequelae, by age, sex, and cause, and weighted by new disability weights for each health state. Neither YLLs nor YLDs were age-weighted or discounted. Uncertainty around cause-specific DALYs was calculated incorporating uncertainty in levels of all-cause mortality, cause-specific mortality, prevalence, and disability weights. FINDINGS Global DALYs remained stable from 1990 (2·503 billion) to 2010 (2·490 billion). Crude DALYs per 1000 decreased by 23% (472 per 1000 to 361 per 1000). An important shift has occurred in DALY composition with the contribution of deaths and disability among children (younger than 5 years of age) declining from 41% of global DALYs in 1990 to 25% in 2010. YLLs typically account for about half of disease burden in more developed regions (high-income Asia Pacific, western Europe, high-income North America, and Australasia), rising to over 80% of DALYs in sub-Saharan Africa. In 1990, 47% of DALYs worldwide were from communicable, maternal, neonatal, and nutritional disorders, 43% from non-communicable diseases, and 10% from injuries. By 2010, this had shifted to 35%, 54%, and 11%, respectively. Ischaemic heart disease was the leading cause of DALYs worldwide in 2010 (up from fourth rank in 1990, increasing by 29%), followed by lower respiratory infections (top rank in 1990; 44% decline in DALYs), stroke (fifth in 1990; 19% increase), diarrhoeal diseases (second in 1990; 51% decrease), and HIV/AIDS (33rd in 1990; 351% increase). Major depressive disorder increased from 15th to 11th rank (37% increase) and road injury from 12th to 10th rank (34% increase). Substantial heterogeneity exists in rankings of leading causes of disease burden among regions. INTERPRETATION Global disease burden has continued to shift away from communicable to non-communicable diseases and from premature death to years lived with disability. In sub-Saharan Africa, however, many communicable, maternal, neonatal, and nutritional disorders remain the dominant causes of disease burden. The rising burden from mental and behavioural disorders, musculoskeletal disorders, and diabetes will impose new challenges on health systems. Regional heterogeneity highlights the importance of understanding local burden of disease and setting goals and targets for the post-2015 agenda taking such patterns into account. Because of improved definitions, methods, and data, these results for 1990 and 2010 supersede all previously published Global Burden of Disease results.
Resumo:
Discounted Cumulative Gain (DCG) is a well-known ranking evaluation measure for models built with multiple relevance graded data. By handling tagging data used in recommendation systems as an ordinal relevance set of {negative,null,positive}, we propose to build a DCG based recommendation model. We present an efficient and novel learning-to-rank method by optimizing DCG for a recommendation model using the tagging data interpretation scheme. Evaluating the proposed method on real-world datasets, we demonstrate that the method is scalable and outperforms the benchmarking methods by generating a quality top-N item recommendation list.
Resumo:
BACKGROUND Quantification of the disease burden caused by different risks informs prevention by providing an account of health loss different to that provided by a disease-by-disease analysis. No complete revision of global disease burden caused by risk factors has been done since a comparative risk assessment in 2000, and no previous analysis has assessed changes in burden attributable to risk factors over time. METHODS We estimated deaths and disability-adjusted life years (DALYs; sum of years lived with disability [YLD] and years of life lost [YLL]) attributable to the independent effects of 67 risk factors and clusters of risk factors for 21 regions in 1990 and 2010. We estimated exposure distributions for each year, region, sex, and age group, and relative risks per unit of exposure by systematically reviewing and synthesising published and unpublished data. We used these estimates, together with estimates of cause-specific deaths and DALYs from the Global Burden of Disease Study 2010, to calculate the burden attributable to each risk factor exposure compared with the theoretical-minimum-risk exposure. We incorporated uncertainty in disease burden, relative risks, and exposures into our estimates of attributable burden. FINDINGS In 2010, the three leading risk factors for global disease burden were high blood pressure (7·0% [95% uncertainty interval 6·2-7·7] of global DALYs), tobacco smoking including second-hand smoke (6·3% [5·5-7·0]), and alcohol use (5·5% [5·0-5·9]). In 1990, the leading risks were childhood underweight (7·9% [6·8-9·4]), household air pollution from solid fuels (HAP; 7·0% [5·6-8·3]), and tobacco smoking including second-hand smoke (6·1% [5·4-6·8]). Dietary risk factors and physical inactivity collectively accounted for 10·0% (95% UI 9·2-10·8) of global DALYs in 2010, with the most prominent dietary risks being diets low in fruits and those high in sodium. Several risks that primarily affect childhood communicable diseases, including unimproved water and sanitation and childhood micronutrient deficiencies, fell in rank between 1990 and 2010, with unimproved water and sanitation accounting for 0·9% (0·4-1·6) of global DALYs in 2010. However, in most of sub-Saharan Africa childhood underweight, HAP, and non-exclusive and discontinued breastfeeding were the leading risks in 2010, while HAP was the leading risk in south Asia. The leading risk factor in Eastern Europe, most of Latin America, and southern sub-Saharan Africa in 2010 was alcohol use; in most of Asia, North Africa and Middle East, and central Europe it was high blood pressure. Despite declines, tobacco smoking including second-hand smoke remained the leading risk in high-income north America and western Europe. High body-mass index has increased globally and it is the leading risk in Australasia and southern Latin America, and also ranks high in other high-income regions, North Africa and Middle East, and Oceania. INTERPRETATION Worldwide, the contribution of different risk factors to disease burden has changed substantially, with a shift away from risks for communicable diseases in children towards those for non-communicable diseases in adults. These changes are related to the ageing population, decreased mortality among children younger than 5 years, changes in cause-of-death composition, and changes in risk factor exposures. New evidence has led to changes in the magnitude of key risks including unimproved water and sanitation, vitamin A and zinc deficiencies, and ambient particulate matter pollution. The extent to which the epidemiological shift has occurred and what the leading risks currently are varies greatly across regions. In much of sub-Saharan Africa, the leading risks are still those associated with poverty and those that affect children.
Resumo:
To classify each stage for a progressing disease such as Alzheimer’s disease is a key issue for the disease prevention and treatment. In this study, we derived structural brain networks from diffusion-weighted MRI using whole-brain tractography since there is growing interest in relating connectivity measures to clinical, cognitive, and genetic data. Relatively little work has usedmachine learning to make inferences about variations in brain networks in the progression of the Alzheimer’s disease. Here we developed a framework to utilize generalized low rank approximations of matrices (GLRAM) and modified linear discrimination analysis for unsupervised feature learning and classification of connectivity matrices. We apply the methods to brain networks derived from DWI scans of 41 people with Alzheimer’s disease, 73 people with EMCI, 38 people with LMCI, 47 elderly healthy controls and 221 young healthy controls. Our results show that this new framework can significantly improve classification accuracy when combining multiple datasets; this suggests the value of using data beyond the classification task at hand to model variations in brain connectivity.
Resumo:
A smoothed rank-based procedure is developed for the accelerated failure time model to overcome computational issues. The proposed estimator is based on an EM-type procedure coupled with the induced smoothing. "The proposed iterative approach converges provided the initial value is based on a consistent estimator, and the limiting covariance matrix can be obtained from a sandwich-type formula. The consistency and asymptotic normality of the proposed estimator are also established. Extensive simulations show that the new estimator is not only computationally less demanding but also more reliable than the other existing estimators.
Resumo:
Ordinal qualitative data are often collected for phenotypical measurements in plant pathology and other biological sciences. Statistical methods, such as t tests or analysis of variance, are usually used to analyze ordinal data when comparing two groups or multiple groups. However, the underlying assumptions such as normality and homogeneous variances are often violated for qualitative data. To this end, we investigated an alternative methodology, rank regression, for analyzing the ordinal data. The rank-based methods are essentially based on pairwise comparisons and, therefore, can deal with qualitative data naturally. They require neither normality assumption nor data transformation. Apart from robustness against outliers and high efficiency, the rank regression can also incorporate covariate effects in the same way as the ordinary regression. By reanalyzing a data set from a wheat Fusarium crown rot study, we illustrated the use of the rank regression methodology and demonstrated that the rank regression models appear to be more appropriate and sensible for analyzing nonnormal data and data with outliers.
Resumo:
Rank-based inference is widely used because of its robustness. This article provides optimal rank-based estimating functions in analysis of clustered data with random cluster effects. The extensive simulation studies carried out to evaluate the performance of the proposed method demonstrate that it is robust to outliers and is highly efficient given the existence of strong cluster correlations. The performance of the proposed method is satisfactory even when the correlation structure is misspecified, or when heteroscedasticity in variance is present. Finally, a real dataset is analyzed for illustration.
Resumo:
For clustered survival data, the traditional Gehan-type estimator is asymptotically equivalent to using only the between-cluster ranks, and the within-cluster ranks are ignored. The contribution of this paper is two fold: - (i) incorporating within-cluster ranks in censored data analysis, and; - (ii) applying the induced smoothing of Brown and Wang (2005, Biometrika) for computational convenience. Asymptotic properties of the resulting estimating functions are given. We also carry out numerical studies to assess the performance of the proposed approach and conclude that the proposed approach can lead to much improved estimators when strong clustering effects exist. A dataset from a litter-matched tumorigenesis experiment is used for illustration.
Resumo:
With growing population and fast urbanization in Australia, it is a challenging task to maintain our water quality. It is essential to develop an appropriate statistical methodology in analyzing water quality data in order to draw valid conclusions and hence provide useful advices in water management. This paper is to develop robust rank-based procedures for analyzing nonnormally distributed data collected over time at different sites. To take account of temporal correlations of the observations within sites, we consider the optimally combined estimating functions proposed by Wang and Zhu (Biometrika, 93:459-464, 2006) which leads to more efficient parameter estimation. Furthermore, we apply the induced smoothing method to reduce the computational burden. Smoothing leads to easy calculation of the parameter estimates and their variance-covariance matrix. Analysis of water quality data from Total Iron and Total Cyanophytes shows the differences between the traditional generalized linear mixed models and rank regression models. Our analysis also demonstrates the advantages of the rank regression models for analyzing nonnormal data.
Resumo:
Environmental data usually include measurements, such as water quality data, which fall below detection limits, because of limitations of the instruments or of certain analytical methods used. The fact that some responses are not detected needs to be properly taken into account in statistical analysis of such data. However, it is well-known that it is challenging to analyze a data set with detection limits, and we often have to rely on the traditional parametric methods or simple imputation methods. Distributional assumptions can lead to biased inference and justification of distributions is often not possible when the data are correlated and there is a large proportion of data below detection limits. The extent of bias is usually unknown. To draw valid conclusions and hence provide useful advice for environmental management authorities, it is essential to develop and apply an appropriate statistical methodology. This paper proposes rank-based procedures for analyzing non-normally distributed data collected at different sites over a period of time in the presence of multiple detection limits. To take account of temporal correlations within each site, we propose an optimal linear combination of estimating functions and apply the induced smoothing method to reduce the computational burden. Finally, we apply the proposed method to the water quality data collected at Susquehanna River Basin in United States of America, which dearly demonstrates the advantages of the rank regression models.
Resumo:
We consider rank regression for clustered data analysis and investigate the induced smoothing method for obtaining the asymptotic covariance matrices of the parameter estimators. We prove that the induced estimating functions are asymptotically unbiased and the resulting estimators are strongly consistent and asymptotically normal. The induced smoothing approach provides an effective way for obtaining asymptotic covariance matrices for between- and within-cluster estimators and for a combined estimator to take account of within-cluster correlations. We also carry out extensive simulation studies to assess the performance of different estimators. The proposed methodology is substantially Much faster in computation and more stable in numerical results than the existing methods. We apply the proposed methodology to a dataset from a randomized clinical trial.
Resumo:
We consider ranked-based regression models for clustered data analysis. A weighted Wilcoxon rank method is proposed to take account of within-cluster correlations and varying cluster sizes. The asymptotic normality of the resulting estimators is established. A method to estimate covariance of the estimators is also given, which can bypass estimation of the density function. Simulation studies are carried out to compare different estimators for a number of scenarios on the correlation structure, presence/absence of outliers and different correlation values. The proposed methods appear to perform well, in particular, the one incorporating the correlation in the weighting achieves the highest efficiency and robustness against misspecification of correlation structure and outliers. A real example is provided for illustration.
Resumo:
We consider rank-based regression models for repeated measures. To account for possible withinsubject correlations, we decompose the total ranks into between- and within-subject ranks and obtain two different estimators based on between- and within-subject ranks. A simple perturbation method is then introduced to generate bootstrap replicates of the estimating functions and the parameter estimates. This provides a convenient way for combining the corresponding two types of estimating function for more efficient estimation.
Resumo:
Adaptions of weighted rank regression to the accelerated failure time model for censored survival data have been successful in yielding asymptotically normal estimates and flexible weighting schemes to increase statistical efficiencies. However, for only one simple weighting scheme, Gehan or Wilcoxon weights, are estimating equations guaranteed to be monotone in parameter components, and even in this case are step functions, requiring the equivalent of linear programming for computation. The lack of smoothness makes standard error or covariance matrix estimation even more difficult. An induced smoothing technique overcame these difficulties in various problems involving monotone but pure jump estimating equations, including conventional rank regression. The present paper applies induced smoothing to the Gehan-Wilcoxon weighted rank regression for the accelerated failure time model, for the more difficult case of survival time data subject to censoring, where the inapplicability of permutation arguments necessitates a new method of estimating null variance of estimating functions. Smooth monotone parameter estimation and rapid, reliable standard error or covariance matrix estimation is obtained.
Resumo:
A 'pseudo-Bayesian' interpretation of standard errors yields a natural induced smoothing of statistical estimating functions. When applied to rank estimation, the lack of smoothness which prevents standard error estimation is remedied. Efficiency and robustness are preserved, while the smoothed estimation has excellent computational properties. In particular, convergence of the iterative equation for standard error is fast, and standard error calculation becomes asymptotically a one-step procedure. This property also extends to covariance matrix calculation for rank estimates in multi-parameter problems. Examples, and some simple explanations, are given.