102 results for R-package
Abstract:
Microsatellite markers have demonstrated their value for performing paternity exclusion and hence exploring mating patterns in plants and animals. Methodology is well established for diploid species, and several software packages exist for elucidating paternity in diploids; however, these issues are not so readily addressed in polyploids due to the increased complexity of the exclusion problem and a lack of available software. We introduce polypatex, an R package for paternity exclusion analysis using microsatellite data in autopolyploid, monoecious or dioecious/bisexual species with a ploidy of 4n, 6n or 8n. Given marker data for a set of offspring, their mothers and a set of candidate fathers, polypatex uses allele matching to exclude candidates whose marker alleles are incompatible with the alleles in each offspring–mother pair. polypatex can analyse marker data sets in which allele copy numbers are known (genotype data) or unknown (allelic phenotype data); for data sets in which allele copy numbers are unknown, comparisons are made taking into account all possible genotypes that could arise from the compared allele sets. polypatex is a software tool that provides population geneticists with the ability to investigate the mating patterns of autopolyploids using paternity exclusion analysis on data from codominant markers having multiple alleles per locus.
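The allele-matching idea described above can be illustrated with a minimal Python sketch. This is not polypatex's code or API: the function names, the data layout (locus name mapped to a set of allele sizes) and the `max_mismatches` parameter are assumptions made for illustration. The sketch covers only the simplest rule for allelic phenotype data, namely that any offspring allele absent from the mother must be present in the true father; it does not enumerate the possible polyploid genotypes.

```python
from typing import Dict, List, Set

# Hypothetical data layout: locus name -> set of observed allele sizes.
Phenotype = Dict[str, Set[int]]


def locus_excludes(offspring: Set[int], mother: Set[int], candidate: Set[int]) -> bool:
    """Return True if the candidate mismatches the offspring-mother pair at one locus."""
    # Alleles the offspring carries that the mother could not have supplied.
    paternal_only = offspring - mother
    # Every such allele must be present in the candidate father.
    return not paternal_only <= candidate


def count_mismatches(offspring: Phenotype, mother: Phenotype, candidate: Phenotype) -> int:
    """Count mismatching loci among those scored in all three individuals."""
    shared = offspring.keys() & mother.keys() & candidate.keys()
    return sum(
        locus_excludes(offspring[locus], mother[locus], candidate[locus])
        for locus in shared
    )


def non_excluded(
    offspring: Phenotype,
    mother: Phenotype,
    candidates: Dict[str, Phenotype],
    max_mismatches: int = 0,
) -> List[str]:
    """Names of candidates with at most max_mismatches mismatching loci."""
    return [
        name
        for name, cand in candidates.items()
        if count_mismatches(offspring, mother, cand) <= max_mismatches
    ]
```

Allowing a small non-zero `max_mismatches` is one common way to tolerate genotyping error; whether and how polypatex does this is not stated in the abstract.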
Abstract:
Seasonal patterns have been found in a remarkable range of health conditions, including birth defects, respiratory infections and cardiovascular disease. Accurately estimating the size and timing of seasonal peaks in disease incidence is an aid to understanding the causes and possibly to developing interventions. With global warming increasing the intensity of seasonal weather patterns around the world, a review of the methods for estimating seasonal effects on health is timely. This is the first book on statistical methods for seasonal data written for a health audience. It describes methods for a range of outcomes (including continuous, count and binomial data) and demonstrates appropriate techniques for summarising and modelling these data. It has a practical focus and uses interesting examples to motivate and illustrate the methods. The statistical procedures and example data sets are available in an R package called ‘season’. Adrian Barnett is a senior research fellow at Queensland University of Technology, Australia. Annette Dobson is a Professor of Biostatistics at The University of Queensland, Australia. Both are experienced medical statisticians with a commitment to statistical education and have previously collaborated in research in the methodological developments and applications of biostatistics, especially to time series data. Among other projects, they worked together on revising the well-known textbook "An Introduction to Generalized Linear Models," third edition, Chapman Hall/CRC, 2008. In their new book they share their knowledge of statistical methods for examining seasonal patterns in health.
Abstract:
Background: Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset. Results: We compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more-informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours. Conclusions: We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis.
mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
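As a rough illustration of outlier-based feature selection, the Python sketch below applies the standard COPA-style transform to one gene (centre on the median, scale by the median absolute deviation) and then flags samples at both extremes. It does not reproduce mCOPA's refinements, which the abstract does not detail; the function names and the `cutoff` value are assumptions chosen for illustration.

```python
from statistics import median
from typing import List, Sequence, Tuple


def copa_transform(values: Sequence[float]) -> List[float]:
    """Centre one gene's expression values on the median and scale by the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; this gene has no spread to scale by")
    return [(v - med) / mad for v in values]


def outlier_samples(
    values: Sequence[float], cutoff: float = 5.0
) -> Tuple[List[int], List[int]]:
    """Return (over, under): sample indices beyond +cutoff and below -cutoff.

    The cutoff is a hypothetical fixed threshold, not a value from the paper.
    """
    scores = copa_transform(values)
    over = [i for i, s in enumerate(scores) if s > cutoff]
    under = [i for i, s in enumerate(scores) if s < -cutoff]
    return over, under
```

Using the median and MAD rather than the mean and standard deviation keeps the scaling from being distorted by the very outliers the method is trying to find.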
Abstract:
Structural equation modeling (SEM) is a powerful statistical approach for the testing of networks of direct and indirect theoretical causal relationships in complex data sets with intercorrelated dependent and independent variables. SEM is commonly applied in ecology, but the spatial information commonly found in ecological data remains difficult to model in a SEM framework. Here we propose a simple method for spatially explicit SEM (SE-SEM) based on the analysis of variance/covariance matrices calculated across a range of lag distances. This method provides readily interpretable plots of the change in path coefficients across scale and can be implemented using any standard SEM software package. We demonstrate the application of this method using three studies examining the relationships between environmental factors, plant community structure, nitrogen fixation, and plant competition. By design, these data sets had a spatial component, but were previously analyzed using standard SEM models. Using these data sets, we demonstrate the application of SE-SEM to regularly spaced, irregularly spaced, and ad hoc spatial sampling designs and discuss the increased inferential capability of this approach compared with standard SEM. We provide an R package, sesem, to easily implement spatial structural equation modeling.
Abstract:
This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov Chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via a Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results will reflect uncertainty in the final model and will report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally light-weight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.
Abstract:
Markov chain Monte Carlo (MCMC) estimation provides a solution to the complex integration problems that are faced in the Bayesian analysis of statistical problems. The implementation of MCMC algorithms is, however, code intensive and time consuming. We have developed a Python package, PyMCMC, that aids in the construction of MCMC samplers, substantially reduces the likelihood of coding error, and minimises repetitive code. PyMCMC contains classes for Gibbs, Metropolis-Hastings, independent Metropolis-Hastings, random walk Metropolis-Hastings, orientational bias Monte Carlo and slice samplers, as well as specific modules for common models, such as a module for Bayesian regression analysis. PyMCMC is straightforward to optimise, taking advantage of the Python libraries Numpy and Scipy, as well as being readily extensible with C or Fortran.
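To give a sense of the hand-written sampler code that a package of this kind is meant to spare the user, here is a minimal random walk Metropolis-Hastings sampler in plain Python. It deliberately does not use PyMCMC, whose class names and signatures are not given in the abstract; the function name, parameters and the standard normal target are assumptions for illustration only.

```python
import math
import random
from typing import Callable, List, Tuple


def random_walk_metropolis(
    log_target: Callable[[float], float],
    start: float,
    n_samples: int,
    step: float = 1.0,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Draw samples from a 1-D density known up to a constant.

    Returns the chain and the fraction of proposals that were accepted.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_target(x)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


def standard_normal_log_density(x: float) -> float:
    """Log density of N(0, 1), omitting the normalising constant."""
    return -0.5 * x * x
```

A typical call would be `chain, rate = random_walk_metropolis(standard_normal_log_density, 0.0, 20000)`, after which the acceptance rate can be used to tune `step`.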
Abstract:
The R statistical environment and language has demonstrated particular strengths for interactive development of statistical algorithms, as well as data modelling and visualisation. Its current implementation has an interpreter at its core which may result in a performance penalty in comparison to directly executing user algorithms in the native machine code of the host CPU. In contrast, the C++ language has no built-in visualisation capabilities, handling of linear algebra or even basic statistical algorithms; however, user programs are converted to high-performance machine code, ahead of execution. A new method avoids possible speed penalties in R by using the Rcpp extension package in conjunction with the Armadillo C++ matrix library. In addition to the inherent performance advantages of compiled code, Armadillo provides an easy-to-use template-based meta-programming framework, allowing the automatic pooling of several linear algebra operations into one, which in turn can lead to further speedups. With the aid of Rcpp and Armadillo, conversion of linear algebra centered algorithms from R to C++ becomes straightforward. The algorithms retain their overall structure and readability, all while maintaining a bidirectional link with the host R environment. Empirical timing comparisons of R and C++ implementations of a Kalman filtering algorithm indicate a speedup of several orders of magnitude.
Abstract:
Introduction: The delivery of health care in the 21st century will look like no other in the past. The fast paced technological advances that are being made will need to transition from the information age into clinical practice. The phenomenon of e-Health is the over-arching form of information technology and telehealth is one arm of that phenomenon. The uptake of telehealth, both in Australia and overseas, has changed the face of health service delivery to many rural and remote communities for the better, removing what is known as the tyranny of distance. Many studies have evaluated the satisfaction and cost-benefit analysis of telehealth across the organisational aspects as well as the various adaptations of clinical pathways, and this is the predominant focus of most studies published to date. However, whilst many researchers have commented on the need to improve and attend to the communication and relationship building aspects of telehealth, no studies have examined this further. The aim of this study was to identify the patient and clinician experiences, concerns, behaviours and perceptions of the telehealth interaction and develop a training tool to assist these clinicians to improve their interaction skills. Methods: A mixed methods design combining quantitative (survey analysis and data coding) and qualitative (interview analysis) approaches was adopted. This study utilised four phases, first to qualitatively explore the needs of clients (patients) and clinicians within a telehealth consultation, and then to design, develop, pilot and quantitatively and qualitatively evaluate the telehealth communication training program. Qualitative data was collected and analysed during Phase 1 of this study to describe and define the missing 'communication and rapport building' aspects within telehealth.
This data was then utilised to develop a self-paced communication training program that enhanced clinicians' existing skills; developing this interactive program comprised Phase 2 of the study. Phase 3 included evaluating the training program with 26 clinicians, with results recorded pre and post training, whilst Phase 4 was the pilot for future recommendations of this training program using a patient group within a Queensland Health setting at two rural hospitals. Results: Comparisons of pre and post training data on 1) effective communication styles, 2) involvement in communication training package, 3) satisfaction pre and post training, and 4) health outcomes pre and post training indicated differences in effective communication style and increased satisfaction, but no difference in health outcomes, between pre and post training for this patient group. The post training results revealed over half of the participants (N = 17, 65%) were more responsive to non-verbal cues and were better able to reflect and respond to looks of anxiousness and confusion from a 'patient' within a telehealth consultation. It was also found that during post training evaluations, clinicians had enhanced their therapeutic communication with greater attention to their own body postures, eye contact and presentation. There was greater time spent looking at the 'patient' with an increase of 35 second intervals of direct eye contact and less time spent looking down at paperwork which decreased by 20 seconds. Overall 73% of the clinicians were satisfied with the training program and 61% strongly agreed that they recognised areas of their communication that needed improving during a telehealth consultation. For the patient group there was a significant difference post training in rapport, with the mean score changing from 42 (SD = 28, n = 27) to 48 (SD = 5.9, n = 24).
For communication comfort of the patient group there was a significant difference between the pre and post training scores, t(10) = 27.9, p = .002, which meant that overall the patients felt less inhibited whilst talking to the clinicians and more understood. Conclusion: The aim of this study was to explore the characteristics of good patient-clinician communication and unmet training needs for telehealth consultations. The study developed a training program that was specific for telehealth consultations and not dependent on a 'trainer' to deliver the content. In light of the existing literature, this program is the first of its kind and a valuable contribution to the research on this topic. It was found that the training program was effective in improving the clinicians' communication style and increased the satisfaction of patients within an e-health environment. This study has identified some historical myths that telehealth cannot be part of empathic patient centred care due to its technology tag.
Abstract:
Industrial employment growth has been one of the most dynamic areas of expansion in Asia; however, current trends in industrialised working environments have resulted in greater employee stress. Despite research showing that cultural values affect the way people cope with stress, there is a dearth of psychometrically established tools for use in non-Western countries to measure these constructs. Studies of the "Way of Coping Checklist-Revised" (WCCL-R) in the West suggest that the WCCL-R has good psychometric properties, but its applicability in the East is still understudied. A confirmatory factor analysis (CFA) is used to validate the WCCL-R constructs in an Asian population. This study used 1,314 participants from Indonesia, Sri Lanka, Singapore, and Thailand. An initial exploratory factor analysis revealed that original structures were not confirmed; however, a subsequent EFA and CFA showed that a 38-item, five-factor structure model was confirmed. The revised WCCL-R in the Asian sample was also found to have good reliability and sound construct and concurrent validity. The 38-item structure of the WCCL-R has considerable potential in future occupational stress-related research in Asian countries.
Abstract:
The current study aims to investigate the non-linear relationship between the JD-R model and work engagement. Previous research has identified linear relationships between these constructs; however, there are strong theoretical arguments for testing curvilinear relationships (e.g., Warr, 1987). Data were collected via a self-report online survey from officers of one Australian police service (N = 2,626). Results demonstrated a curvilinear relationship between job demands and job resources and engagement. Gender (as a control variable) was also found to be a significant predictor of work engagement. The results indicated that male police officers experienced significantly higher job demands and colleague support than female officers. However, female police officers reported significantly higher levels of work engagement than male officers. This study emphasises the need to test curvilinear relationships, as well as simple linear associations, when measuring psychological health.
Abstract:
Significant sums of money are invested in developing technological innovations that have low levels and rates of adoption. Several approaches have been put forward in an effort to improve rates of adoption. This paper presents the results of a study that examined the innovation fit of key technological innovations in the beef industry. Findings indicate that by assessing the innovation fit throughout the R&D process, researchers and end users can collaborate to improve the innovation fit and the rate of adoption. The paper also puts forward a model that demonstrates the linkages between R&D, adoption and innovation fit.
Abstract:
This book is a thorough investigation of the relationship between land use planning and the railways in Britain, through review of the factors affecting the two sectors and their integration during the period of public ownership. The rationale behind the book is explained as a timely analysis of the dynamic correlation involving town planning and management of the railway in a period when growing congestion on the road network is forcing people to look for alternative modes and capacity is badly needed to accommodate this increased demand for travel. The book calls for a modal shift from road to rail for passenger and freight traffic.