914 resultados para statistical data analysis


Relevância:

100.00%

Publicador:

Resumo:

Complexity in time series is an intriguing feature of living dynamical systems, with potential use for identification of system state. Although various methods have been proposed for measuring physiologic complexity, uncorrelated time series are often assigned high values of complexity, erroneously classifying them as complex physiological signals. Here, we propose and discuss a method for complex system analysis based on a generalized statistical formalism and surrogate time series. Sample entropy (SampEn) was rewritten, inspired by the Tsallis generalized entropy, as a function of the parameter q (qSampEn). qSDiff curves, defined as the difference between the qSampEn of the original and surrogate series, were calculated. We evaluated qSDiff for 125 real heart rate variability (HRV) recordings, divided into groups of 70 healthy, 44 congestive heart failure (CHF), and 11 atrial fibrillation (AF) subjects, and for simulated series from stochastic and chaotic processes. The evaluations showed that, for nonperiodic signals, qSDiff curves have a maximum point (qSDiff(max)) at q ≠ 1. The values of q at which the maximum occurs and at which qSDiff is zero were also evaluated. Only qSDiff(max) values were capable of distinguishing the HRV groups (p-values 5.10 × 10^-3, 1.11 × 10^-7, and 5.50 × 10^-7 for healthy vs. CHF, healthy vs. AF, and CHF vs. AF, respectively), consistent with the concept of physiologic complexity, which suggests a potential use for chaotic system analysis. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.4758815]
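The classical sample entropy from which the q-generalized qSampEn departs can be sketched in a few lines. This is an illustrative implementation of standard SampEn only, not the authors' Tsallis-inspired variant; the function name and default parameters (m = 2, r = 0.2) are my own choices.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Classical SampEn: -ln(A/B), where B counts template matches of
    length m and A counts matches of length m+1, a match being a pair
    of templates within Chebyshev distance r * std(x)."""
    x = np.asarray(x, dtype=float)
    tol = r * x.std()

    def count_matches(length):
        templates = np.array([x[i:i + length] for i in range(len(x) - length)])
        count = 0
        for i in range(len(templates)):
            # Chebyshev distance to every later template (each pair counted once)
            dists = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(dists <= tol)
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf
```

As the abstract notes for complexity measures generally, uncorrelated noise scores high on this quantity while a regular signal scores low, which is precisely the behaviour the surrogate-based qSDiff construction is designed to correct for.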


Statistical shape analysis techniques commonly employed in the medical imaging community, such as active shape models or active appearance models, rely on principal component analysis (PCA) to decompose shape variability into a reduced set of interpretable components. In this paper we propose principal factor analysis (PFA) as an alternative and complementary tool to PCA, providing a decomposition into modes of variation that can be more easily interpreted, while still being an efficient linear technique that performs dimensionality reduction (as opposed to independent component analysis, ICA). The key difference between PFA and PCA is that PFA models the covariance between variables, rather than the total variance in the data. The added value of PFA is illustrated on 2D landmark data of corpora callosa outlines. Then, a study of the 3D shape variability of the human left femur is performed. Finally, we report results on vector-valued 3D deformation fields resulting from the non-rigid registration of ventricles in MRI of the brain.
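The PCA/PFA distinction can be illustrated with a minimal numpy sketch of the principal factor method: eigendecompose the correlation matrix after replacing its diagonal with communality estimates, so that only shared (co)variance is modelled. The communality estimate used here (each variable's largest absolute off-diagonal correlation) is one common textbook choice, not necessarily the paper's; everything below is an illustration, not their implementation.

```python
import numpy as np

def principal_factors(X, n_factors=1):
    """Principal factor analysis sketch. Unlike PCA, which decomposes
    the full correlation matrix (total variance), PFA decomposes a
    'reduced' correlation matrix whose diagonal holds communality
    estimates, so only variance shared between variables is modelled."""
    R = np.corrcoef(X, rowvar=False)
    Rh = R.copy()
    off = R - np.eye(R.shape[0])
    np.fill_diagonal(Rh, np.max(np.abs(off), axis=1))  # communality estimates
    vals, vecs = np.linalg.eigh(Rh)                    # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_factors]         # keep the largest
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))  # loadings
```

With data driven by one common factor plus independent noise, the single returned factor loads substantially on every variable, reflecting the shared variation only.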


Vietnam has developed rapidly over the past 15 years. However, progress has not been uniformly distributed across the country. The availability, adequate visualization, and analysis of spatially explicit data on socio-economic and environmental aspects can support both research and policy towards sustainable development. Applying appropriate mapping techniques allows gleaning important information from tabular socio-economic data. Spatial analysis of socio-economic phenomena can yield insights into locally-specific patterns and processes that cannot be generated by non-spatial applications. This paper presents techniques and applications that develop and analyze spatially highly disaggregated socio-economic datasets. A number of examples show how such information can support informed decision-making and research in Vietnam.


Most statistical analysis, in theory and practice, is concerned with static models: models with a proposed set of parameters whose values are fixed across observational units. Static models implicitly assume that the quantified relationships remain the same across the design space of the data. While this is reasonable under many circumstances, it can be a dangerous assumption when dealing with sequentially ordered data. The mere passage of time always brings fresh considerations, and the interrelationships among parameters, or subsets of parameters, may need to be continually revised. When data are gathered sequentially, dynamic interim monitoring may be useful as new subject-specific parameters are introduced with each new observational unit. Sequential imputation via dynamic hierarchical models is an efficient strategy for handling missing data and analyzing longitudinal studies. Dynamic conditional independence models offer a flexible framework that exploits the Bayesian updating scheme for capturing the evolution of both population and individual effects over time. While static models often describe aggregate information well, they often do not reflect conflicts in the information at the individual level. Dynamic models prove advantageous over static models in capturing both individual and aggregate trends. Computations for such models can be carried out via the Gibbs sampler. An application using small-sample, repeated-measures, normally distributed growth-curve data is presented.
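The simplest concrete example of this Bayesian updating scheme is the local-level dynamic linear model, where the level parameter is allowed to drift and is revised with each new observation. The sketch below is forward filtering only, with illustrative variance settings; it is not the dissertation's Gibbs-sampled hierarchical model, merely the updating idea it builds on.

```python
import numpy as np

def local_level_filter(y, obs_var=1.0, state_var=0.1, m0=0.0, c0=10.0):
    """Forward filtering for a local-level dynamic model:
       theta_t = theta_{t-1} + w_t,  y_t = theta_t + v_t.
    The posterior mean m is revised at every step, so the 'parameter'
    evolves over time instead of staying fixed as in a static model."""
    m, c = m0, c0
    means = []
    for yt in y:
        r = c + state_var            # prior variance after state evolution
        k = r / (r + obs_var)        # adaptive gain on the new observation
        m = m + k * (yt - m)         # posterior mean update
        c = (1 - k) * r              # posterior variance update
        means.append(m)
    return np.array(means)
```

Fed a series whose true level shifts midway, the filtered mean tracks the shift within a few observations, which is exactly the behaviour a static fixed-parameter fit cannot reproduce.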


Aims: The reported rate of stent thrombosis (ST) after drug-eluting stent (DES) implantation varies among registries. To investigate differences in baseline characteristics and clinical outcome between European and Japanese all-comers registries, we performed a pooled analysis of patient-level data. Methods and results: The j-Cypher registry (JC) is a multicentre observational study conducted in Japan, including 12,824 patients undergoing sirolimus-eluting stent (SES) implantation. From the Bern-Rotterdam registry (BR), conducted at two academic hospitals in Switzerland and the Netherlands, 3,823 patients with SES were included in the current analysis. Patients in BR were younger, more frequently smokers, and presented more frequently with ST-elevation myocardial infarction (MI). Conversely, JC patients more frequently had diabetes and hypertension. At five years, the definite ST rate was significantly lower in JC than in BR (JC 1.6% vs. BR 3.3%, p<0.001), while unadjusted mortality tended to be lower in BR than in JC (BR 13.2% vs. JC 14.4%, log-rank p=0.052). After adjustment, the j-Cypher registry was associated with a significantly lower risk of all-cause mortality (HR 0.56, 95% CI: 0.49-0.64) as well as of definite stent thrombosis (HR 0.46, 95% CI: 0.35-0.61). Conclusions: The baseline characteristics of the two large registries were different. After statistical adjustment, JC was associated with lower mortality and ST.


Background: Protein-energy malnutrition (PEM) is common in people with end-stage kidney disease (ESKD) undergoing maintenance haemodialysis (MHD) and correlates strongly with mortality. To this day, there is no gold standard for detecting PEM in patients on MHD. Aim of study: The aim of this study was to evaluate whether Nutritional Risk Screening 2002 (NRS-2002), handgrip strength measurement, mid-upper arm muscle area (MUAMA), triceps skin fold measurement (TSF), serum albumin, normalised protein catabolic rate (nPCR), Kt/V and eKt/V, dry body weight, body mass index (BMI), age, and time since start of MHD are relevant for assessing PEM in patients on MHD. Methods: The predictive value of the selected parameters on mortality, and on mortality or weight loss of more than 5%, was assessed. Quantitative data analysis of the 12 parameters in the same patients on MHD in autumn 2009 (n = 64) and spring 2011 (n = 40), with paired statistical analysis and multivariate logistic regression analysis, was performed. Results: Paired data analysis showed significant reductions in dry body weight, BMI, and nPCR. Kt/Vtot did not change; eKt/V and handgrip strength measurements were significantly higher in spring 2011. No changes were detected in TSF, serum albumin, NRS-2002, and MUAMA. Serum albumin was shown to be the only predictor of death and of the combined endpoint “death or weight loss of more than 5%”. Conclusion: We now screen patients biannually for serum albumin, nPCR, Kt/V, handgrip measurement of the shunt-free arm, dry body weight, age, and time since initiation of MHD.


Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and are emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable for rare variants. This raises great challenges for sequence-based genetic studies of complex diseases. This dissertation used the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, for developing novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, ultimately shifting the paradigm of association analysis from the current locus-by-locus analysis to collectively analyzing genome regions. In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques were used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of genome or a gene, regardless of whether the variants are common or rare. Classical quantitative genetics suffers from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with a scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their applications, the functional linear models were applied to five quantitative traits in the Framingham Heart Study.
This project also proposed a novel concept of gene-gene co-association, in which a gene or a genomic region is taken as the unit of association analysis, and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets, which led to the discovery of networks significantly associated with psoriasis.


Introduction. Despite the ban on lead-containing gasoline and paint, childhood lead poisoning remains a public health issue. Furthermore, a Medicaid-eligible child is 8 times more likely to have an elevated blood lead level (EBLL) than a non-Medicaid child, which is the primary reason for the early-detection lead screening mandate at ages 12 and 24 months among the Medicaid population. Based on field observations, there was evidence that suggested a screening compliance issue. Objective. The purpose of this study was to analyze blood lead screening compliance in previously lead-poisoned Medicaid children and to test for an association between timely lead screening and timely childhood immunizations. The mean months between follow-up tests were also examined for a significant difference between non-compliant and compliant lead-screened children. Methods. Access to the surveillance data of all childhood lead poisoning cases in Bexar County was granted by the San Antonio Metropolitan Health District. A database was constructed and analyzed using descriptive statistics, logistic regression methods, and non-parametric tests. Lead screening at 12 months of age was analyzed separately from lead screening at 24 months. The small portion of the population who were related were included in one analysis and removed from a second analysis to check for significance. Gender, ethnicity, age of home, and having a sibling with an EBLL were ruled out as confounders for the association tests, but ethnicity and age of home were adjusted for in the non-parametric tests. Results. There was a strong significant association between lead screening compliance at 12 months and childhood immunization compliance, with or without including related children (p<0.00). However, there was no significant association between the two variables at the age of 24 months.
Furthermore, there was no significant difference between the medians of the mean months of follow-up blood tests among the non-compliant and compliant lead-screened population at the 12-month screening group, but there was a significant difference at the 24-month screening group (p<0.01). Discussion. Descriptive statistics showed that 61% and 56% of the previously lead-poisoned Medicaid population did not receive their 12- and 24-month mandated lead screening on time, respectively. This suggests that their elevated blood lead levels might have been diagnosed earlier in childhood with timely screening. Furthermore, a child who is compliant with their lead screening at 12 months of age is 2.36 times more likely to also receive their childhood immunizations on time compared with a child who was not compliant with their 12-month screening. Even though no statistically significant association was found for the 24-month group, the public health significance of a screening compliance issue is no less important. The Texas Medicaid program needs to enforce lead screening compliance, because it is evident that there has been no monitoring system in place. Further recommendations include an increased focus on parental education and on the importance of taking children for wellness exams on time.


The aim of this paper is to propose a mathematical model to determine invariant sets, set coverings, orbits and, in particular, attractors in the set of tourism variables. The analysis was carried out using a pre-designed algorithm and by applying our interpretation of chaos theory, developed in the context of General Systems Theory. This article sets out the causal relationships associated with tourist flows in order to enable the formulation of appropriate strategies. Our results can be applied to numerous cases. For example, in the analysis of tourist flows, these findings can be used to determine whether the behaviour of certain groups affects that of other groups, and to analyse tourist behaviour in terms of the most relevant variables. Unlike statistical analyses that merely provide information on current data, our method uses orbit analysis to forecast, if attractors are found, the behaviour of tourism variables in the immediate future.


"In any comprehensive research project, there are essentially five steps. First, one starts with a literature review with regard to a particular research question. Second, one seeks to develop a theory. Third, the research question is finalized, frequently in the form of a hypothesis to be tested. Fourth, data are collected. Fifth, the subject matter of this paper, the data are analyzed in order to come to a resolution of the research question. There are two general approaches to analyzing research data. If the data were gathered concerning a 'research question,' a description of the data may be sufficient. However, if the data were gathered to accept or reject a formal hypothesis, statistical analysis is usually in order. This paper briefly surveys the principal data analysis methodologies that are available."


Transportation Department, Office of Transportation Systems Analysis and Information, Washington, D.C.


This article explains, first, the reasons why a knowledge of statistics is necessary, and describes the role that statistics plays in an experimental investigation. Second, the normal distribution is introduced, which describes the natural variability shown by many measurements in optometry and vision sciences. Third, the application of the normal distribution to some common statistical problems is described, including how to determine whether an individual observation is a typical member of a population and how to determine the confidence interval for a sample mean.
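The two problems mentioned, judging whether an individual observation is typical of a population and constructing a confidence interval for a sample mean, reduce to short calculations. The sketch below uses the large-sample z value 1.96 for a 95% interval; the article itself may well use t-based intervals for small samples, so treat this as an illustration.

```python
import math

def mean_ci_95(xs):
    """Large-sample 95% confidence interval for a sample mean under
    approximate normality: mean +/- 1.96 * standard error."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance
    se = math.sqrt(var / n)
    return mean - 1.96 * se, mean + 1.96 * se

def z_score(x, mu, sigma):
    """How typical an individual observation is of a population:
    |z| > 1.96 places it outside the central 95% of a normal."""
    return (x - mu) / sigma
```

For example, an observation two population standard deviations above the mean (z = 2.0) would be flagged as atypical at the conventional 5% level.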


In this second article, statistical ideas are extended to the problem of testing whether there is a true difference between two samples of measurements. First, it will be shown that the difference between the means of two samples comes from a population of such differences which is normally distributed. Second, the 't' distribution, one of the most important in statistics, will be applied to a test of the difference between two means using a simple data set drawn from a clinical experiment in optometry. Third, in making a t-test, a statistical judgement is made as to whether there is a significant difference between the means of two samples. Before the widespread use of statistical software, this judgement was made with reference to a statistical table. Even if such tables are no longer used, it is useful to understand their logical structure and how to use them. Finally, the analysis of data known to depart significantly from the normal distribution will be described.
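The pooled-variance two-sample t statistic described above can be computed directly; the resulting value would then be compared against the t distribution at the stated degrees of freedom, via a table or software. A minimal sketch (equal-variance form, names my own):

```python
import math

def two_sample_t(a, b):
    """Unpaired t statistic for the difference between two sample
    means, assuming equal variances (pooled estimate). Returns the
    t value and the degrees of freedom n_a + n_b - 2."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)      # sums of squared deviations
    ssb = sum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)        # pooled variance
    t = (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2
```

The statistical judgement the article describes is then whether |t| exceeds the critical value for those degrees of freedom at the chosen significance level.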


Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures, since anyone with a database and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure, as well as knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R-squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study [5]. This advice should be taken only as a rough guide, but it does indicate that the variables included should be selected with great care, as the inclusion of an obviously unimportant variable may have a significant impact on the sample size required.
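The R-squared figure the passage warns about is straightforward to compute from a least-squares fit. A numpy sketch (function and variable names are illustrative):

```python
import numpy as np

def ols_r_squared(X, y):
    """Fit y = Xb + e by least squares (with an intercept column
    added) and return R-squared, the fraction of the variance of y
    explained by the regression -- the quantity the text suggests
    treating with suspicion when it falls below 50%."""
    Xd = np.column_stack([np.ones(len(y)), X])      # add intercept
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)   # least-squares fit
    resid = y - Xd @ beta
    ss_res = resid @ resid                          # residual sum of squares
    ss_tot = ((y - y.mean()) ** 2).sum()            # total sum of squares
    return 1 - ss_res / ss_tot
```

An exactly linear relationship gives R-squared of 1; an analysis of many weak, intercorrelated predictors will typically yield a much lower value, which is the exploratory-use caveat the passage makes.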


A recent approach to the visualisation and analysis of datasets, one which is particularly applicable to those of high dimension, is discussed in the context of real applications. A feed-forward neural network is utilised to effect a topographic, structure-preserving, dimension-reducing transformation of the data, with an additional facility to incorporate different degrees of associated subjective information. The properties of this transformation are illustrated on synthetic and real datasets, including the 1992 UK Research Assessment Exercise for funding in higher education. The method is compared and contrasted with established techniques for feature extraction, and related to topographic mappings, the Sammon projection, and the statistical field of multidimensional scaling.
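The Sammon projection mentioned for comparison minimises a specific "stress" between the original and projected inter-point distances, weighting small original distances most heavily, which is what makes it structure-preserving. The objective itself is compact; the sketch below computes the stress only, not the full optimisation, and the function name is my own.

```python
import numpy as np

def sammon_stress(D, d):
    """Sammon's stress between original pairwise distances D and
    projected pairwise distances d (both square symmetric matrices):
    sum over pairs of (D - d)^2 / D, normalised by the sum of D.
    Small original distances receive the largest weight."""
    mask = np.triu(np.ones_like(D, dtype=bool), k=1)  # each pair once
    return ((D[mask] - d[mask]) ** 2 / D[mask]).sum() / D[mask].sum()
```

A projection that reproduces every distance exactly has zero stress; any distortion of the small distances raises the stress quickly, in contrast to plain metric MDS, which weights all pairs equally.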