878 resultados para regression discrete models
Resumo:
This paper uses a correlated multinomial logit model and a Poisson regression model to measure the factors affecting demand for different types of transportation by elderly and disabled people in rural Virginia. The major results are: (a) A paratransit system providing door-to-door service is highly valued by transportation-handicapped people; (b) Taxis are probably a potential but inferior alternative even when subsidized; (c) Buses are a poor alternative, especially in rural areas where distances to bus stops may be long; (d) Making buses handicap-accessible would have a statistically significant but small effect on mode choice; (e) Demand is price inelastic; and (f) The total number of trips taken is insensitive to mode availability and characteristics. These results suggest that transportation-handicapped people take a limited number of trips. Those they do take are in some sense necessary (given the low elasticity with respect to mode price or availability). People will substitute away from relying upon others when appropriate transportation is available, at least to some degree. But such transportation needs to be flexible enough to meet the needs of the people involved.
Resumo:
This paper demonstrates the use of a spreadsheet in exploring non-linear difference equations that describe digital control systems used in radio engineering, communication and computer architecture. These systems, being the focus of intensive studies of mathematicians and engineers over the last 40 years, may exhibit extremely complicated behaviour interpreted in contemporary terms as transition from global asymptotic stability to chaos through period-doubling bifurcations. The authors argue that embedding advanced mathematical ideas in the technological tool enables one to introduce fundamentals of discrete control systems in tertiary curricula without learners having to deal with complex machinery that rigorous mathematical methods of investigation require. In particular, in the appropriately designed spreadsheet environment, one can effectively visualize a qualitative difference in the behviour of systems with different types of non-linear characteristic.
Resumo:
This paper presents an event-based failure model to predict the number of failures that occur in water distribution assets. Often, such models have been based on analysis of historical failure data combined with pipe characteristics and environmental conditions. In this paper weather data have been added to the model to take into account the commonly observed seasonal variation of the failure rate. The theoretical basis of existing logistic regression models is briefly described in this paper, along with the refinements made to the model for inclusion of seasonal variation of weather. The performance of these refinements is tested using data from two Australian water authorities.
Resumo:
A discrete agent-based model on a periodic lattice of arbitrary dimension is considered. Agents move to nearest-neighbor sites by a motility mechanism accounting for general interactions, which may include volume exclusion. The partial differential equation describing the average occupancy of the agent population is derived systematically. A diffusion equation arises for all types of interactions and is nonlinear except for the simplest interactions. In addition, multiple species of interacting subpopulations give rise to an advection-diffusion equation for each subpopulation. This work extends and generalizes previous specific results, providing a construction method for determining the transport coefficients in terms of a single conditional transition probability, which depends on the occupancy of sites in an influence region. These coefficients characterize the diffusion of agents in a crowded environment in biological and physical processes.
Resumo:
Land-use regression (LUR) is a technique that can improve the accuracy of air pollution exposure assessment in epidemiological studies. Most LUR models are developed for single cities, which places limitations on their applicability to other locations. We sought to develop a model to predict nitrogen dioxide (NO2) concentrations with national coverage of Australia by using satellite observations of tropospheric NO2 columns combined with other predictor variables. We used a generalised estimating equation (GEE) model to predict annual and monthly average ambient NO2 concentrations measured by a national monitoring network from 2006 through 2011. The best annual model explained 81% of spatial variation in NO2 (absolute RMS error=1.4 ppb), while the best monthly model explained 76% (absolute RMS error=1.9 ppb). We applied our models to predict NO2 concentrations at the ~350,000 census mesh blocks across the country (a mesh block is the smallest spatial unit in the Australian census). National population-weighted average concentrations ranged from 7.3 ppb (2006) to 6.3 ppb (2011). We found that a simple approach using tropospheric NO2 column data yielded models with slightly better predictive ability than those produced using a more involved approach that required simulation of surface-to-column ratios. The models were capable of capturing within-urban variability in NO2, and offer the ability to estimate ambient NO2 concentrations at monthly and annual time scales across Australia from 2006–2011. We are making our model predictions freely available for research.
Resumo:
A computationally efficient sequential Monte Carlo algorithm is proposed for the sequential design of experiments for the collection of block data described by mixed effects models. The difficulty in applying a sequential Monte Carlo algorithm in such settings is the need to evaluate the observed data likelihood, which is typically intractable for all but linear Gaussian models. To overcome this difficulty, we propose to unbiasedly estimate the likelihood, and perform inference and make decisions based on an exact-approximate algorithm. Two estimates are proposed: using Quasi Monte Carlo methods and using the Laplace approximation with importance sampling. Both of these approaches can be computationally expensive, so we propose exploiting parallel computational architectures to ensure designs can be derived in a timely manner. We also extend our approach to allow for model uncertainty. This research is motivated by important pharmacological studies related to the treatment of critically ill patients.
Resumo:
Wound healing and tumour growth involve collective cell spreading, which is driven by individual motility and proliferation events within a population of cells. Mathematical models are often used to interpret experimental data and to estimate the parameters so that predictions can be made. Existing methods for parameter estimation typically assume that these parameters are constants and often ignore any uncertainty in the estimated values. We use approximate Bayesian computation (ABC) to estimate the cell diffusivity, D, and the cell proliferation rate, λ, from a discrete model of collective cell spreading, and we quantify the uncertainty associated with these estimates using Bayesian inference. We use a detailed experimental data set describing the collective cell spreading of 3T3 fibroblast cells. The ABC analysis is conducted for different combinations of initial cell densities and experimental times in two separate scenarios: (i) where collective cell spreading is driven by cell motility alone, and (ii) where collective cell spreading is driven by combined cell motility and cell proliferation. We find that D can be estimated precisely, with a small coefficient of variation (CV) of 2–6%. Our results indicate that D appears to depend on the experimental time, which is a feature that has been previously overlooked. Assuming that the values of D are the same in both experimental scenarios, we use the information about D from the first experimental scenario to obtain reasonably precise estimates of λ, with a CV between 4 and 12%. Our estimates of D and λ are consistent with previously reported values; however, our method is based on a straightforward measurement of the position of the leading edge whereas previous approaches have involved expensive cell counting techniques. Additional insights gained using a fully Bayesian approach justify the computational cost, especially since it allows us to accommodate information from different experiments in a principled way.
Resumo:
Tumour microenvironment greatly influences the development and metastasis of cancer progression. The development of three dimensional (3D) culture models which mimic that displayed in vivo can improve cancer biology studies and accelerate novel anticancer drug screening. Inspired by a systems biology approach, we have formed 3D in vitro bioengineered tumour angiogenesis microenvironments within a glycosaminoglycan-based hydrogel culture system. This microenvironment model can routinely recreate breast and prostate tumour vascularisation. The multiple cell types cultured within this model were less sensitive to chemotherapy when compared with two dimensional (2D) cultures, and displayed comparative tumour regression to that displayed in vivo. These features highlight the use of our in vitro culture model as a complementary testing platform in conjunction with animal models, addressing key reduction and replacement goals of the future. We anticipate that this biomimetic model will provide a platform for the in-depth analysis of cancer development and the discovery of novel therapeutic targets.
Resumo:
This project constructed virtual plant leaf surfaces from digitised data sets for use in droplet spray models. Digitisation techniques for obtaining data sets for cotton, chenopodium and wheat leaves are discussed and novel algorithms for the reconstruction of the leaves from these three plant species are developed. The reconstructed leaf surfaces are included into agricultural droplet spray models to investigate the effect of the nozzle and spray formulation combination on the proportion of spray retained by the plant. A numerical study of the post-impaction motion of large droplets that have formed on the leaf surface is also considered.
Resumo:
Large multisite efforts (e.g., the ENIGMA Consortium), have shown that neuroimaging traits including tract integrity (from DTI fractional anisotropy, FA) and subcortical volumes (from T1-weighted scans) are highly heritable and promising phenotypes for discovering genetic variants associated with brain structure. However, genetic correlations (rg) among measures from these different modalities for mapping the human genome to the brain remain unknown. Discovering these correlations can help map genetic and neuroanatomical pathways implicated in development and inherited risk for disease. We use structural equation models and a twin design to find rg between pairs of phenotypes extracted from DTI and MRI scans. When controlling for intracranial volume, the caudate as well as related measures from the limbic system - hippocampal volume - showed high rg with the cingulum FA. Using an unrelated sample and a Seemingly Unrelated Regression model for bivariate analysis of this connection, we show that a multivariate GWAS approach may be more promising for genetic discovery than a univariate approach applied to each trait separately.
Resumo:
Background Multilevel and spatial models are being increasingly used to obtain substantive information on area-level inequalities in cancer survival. Multilevel models assume independent geographical areas, whereas spatial models explicitly incorporate geographical correlation, often via a conditional autoregressive prior. However the relative merits of these methods for large population-based studies have not been explored. Using a case-study approach, we report on the implications of using multilevel and spatial survival models to study geographical inequalities in all-cause survival. Methods Multilevel discrete-time and Bayesian spatial survival models were used to study geographical inequalities in all-cause survival for a population-based colorectal cancer cohort of 22,727 cases aged 20–84 years diagnosed during 1997–2007 from Queensland, Australia. Results Both approaches were viable on this large dataset, and produced similar estimates of the fixed effects. After adding area-level covariates, the between-area variability in survival using multilevel discrete-time models was no longer significant. Spatial inequalities in survival were also markedly reduced after adjusting for aggregated area-level covariates. Only the multilevel approach however, provided an estimation of the contribution of geographical variation to the total variation in survival between individual patients. Conclusions With little difference observed between the two approaches in the estimation of fixed effects, multilevel models should be favored if there is a clear hierarchical data structure and measuring the independent impact of individual- and area-level effects on survival differences is of primary interest. Bayesian spatial analyses may be preferred if spatial correlation between areas is important and if the priority is to assess small-area variations in survival and map spatial patterns. Both approaches can be readily fitted to geographically enabled survival data from international settings
Resumo:
Ordinal qualitative data are often collected for phenotypical measurements in plant pathology and other biological sciences. Statistical methods, such as t tests or analysis of variance, are usually used to analyze ordinal data when comparing two groups or multiple groups. However, the underlying assumptions such as normality and homogeneous variances are often violated for qualitative data. To this end, we investigated an alternative methodology, rank regression, for analyzing the ordinal data. The rank-based methods are essentially based on pairwise comparisons and, therefore, can deal with qualitative data naturally. They require neither normality assumption nor data transformation. Apart from robustness against outliers and high efficiency, the rank regression can also incorporate covariate effects in the same way as the ordinary regression. By reanalyzing a data set from a wheat Fusarium crown rot study, we illustrated the use of the rank regression methodology and demonstrated that the rank regression models appear to be more appropriate and sensible for analyzing nonnormal data and data with outliers.
Resumo:
Rank-based inference is widely used because of its robustness. This article provides optimal rank-based estimating functions in analysis of clustered data with random cluster effects. The extensive simulation studies carried out to evaluate the performance of the proposed method demonstrate that it is robust to outliers and is highly efficient given the existence of strong cluster correlations. The performance of the proposed method is satisfactory even when the correlation structure is misspecified, or when heteroscedasticity in variance is present. Finally, a real dataset is analyzed for illustration.
Resumo:
For clustered survival data, the traditional Gehan-type estimator is asymptotically equivalent to using only the between-cluster ranks, and the within-cluster ranks are ignored. The contribution of this paper is two fold: - (i) incorporating within-cluster ranks in censored data analysis, and; - (ii) applying the induced smoothing of Brown and Wang (2005, Biometrika) for computational convenience. Asymptotic properties of the resulting estimating functions are given. We also carry out numerical studies to assess the performance of the proposed approach and conclude that the proposed approach can lead to much improved estimators when strong clustering effects exist. A dataset from a litter-matched tumorigenesis experiment is used for illustration.
Resumo:
With growing population and fast urbanization in Australia, it is a challenging task to maintain our water quality. It is essential to develop an appropriate statistical methodology in analyzing water quality data in order to draw valid conclusions and hence provide useful advices in water management. This paper is to develop robust rank-based procedures for analyzing nonnormally distributed data collected over time at different sites. To take account of temporal correlations of the observations within sites, we consider the optimally combined estimating functions proposed by Wang and Zhu (Biometrika, 93:459-464, 2006) which leads to more efficient parameter estimation. Furthermore, we apply the induced smoothing method to reduce the computational burden. Smoothing leads to easy calculation of the parameter estimates and their variance-covariance matrix. Analysis of water quality data from Total Iron and Total Cyanophytes shows the differences between the traditional generalized linear mixed models and rank regression models. Our analysis also demonstrates the advantages of the rank regression models for analyzing nonnormal data.