244 results for regression discrete models
at Queensland University of Technology - ePrints Archive
Abstract:
Invasion waves of cells play an important role in development, disease and repair. Standard discrete models of such processes typically involve simulating cell motility, cell proliferation and cell-to-cell crowding effects in a lattice-based framework. The continuum-limit description is often given by a reaction–diffusion equation that is related to the Fisher–Kolmogorov equation. One of the limitations of a standard lattice-based approach is that real cells move and proliferate in continuous space and are not restricted to a predefined lattice structure. We present a lattice-free model of cell motility and proliferation, with cell-to-cell crowding effects, and we use the model to replicate invasion wave-type behaviour. The continuum-limit description of the discrete model is a reaction–diffusion equation with a proliferation term that is different from lattice-based models. Comparing lattice-based and lattice-free simulations indicates that both models lead to invasion fronts that are similar at the leading edge, where the cell density is low. Conversely, the two models make different predictions in the high-density region of the domain, well behind the leading edge. We analyse the continuum-limit descriptions of the lattice-based and lattice-free models to show that both give rise to invasion wave-type solutions that move with the same speed but have very different shapes. We explore the significance of these differences by calibrating the parameters in the standard Fisher–Kolmogorov equation using data from the lattice-free model. We conclude that estimating parameters using this kind of standard procedure can produce misleading results.
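A minimal numerical sketch (not the authors' code) of the standard Fisher–Kolmogorov (Fisher–KPP) equation referred to above, ∂u/∂t = D ∂²u/∂x² + λu(1 − u), whose travelling-wave solutions move at speed 2√(Dλ); the explicit scheme, parameter values and initial condition are illustrative assumptions.

```python
import numpy as np

# Illustrative parameters (assumptions, not values from the paper)
D, lam = 1.0, 1.0              # diffusivity and proliferation rate
L, nx, dt, T = 100.0, 501, 0.01, 20.0
x = np.linspace(0.0, L, nx)
dx = x[1] - x[0]

# Initial condition: confluent cells on the left, empty space on the right
u = np.where(x < 10.0, 1.0, 0.0)

for _ in range(int(T / dt)):
    # Explicit Euler step with no-flux boundaries
    lap = np.empty_like(u)
    lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
    lap[0] = 2 * (u[1] - u[0]) / dx**2
    lap[-1] = 2 * (u[-2] - u[-1]) / dx**2
    u = u + dt * (D * lap + lam * u * (1.0 - u))

# The front moves at roughly the theoretical asymptotic speed 2*sqrt(D*lam)
front = x[np.argmax(u < 0.5)]
print(f"front position at t={T}: {front:.1f} (theory ~ {2*np.sqrt(D*lam)*T + 10:.1f})")
```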
Abstract:
Most statistical methods use hypothesis testing. Analysis of variance, regression, discrete choice models, contingency tables, and other analysis methods commonly used in transportation research share hypothesis testing as the means of making inferences about the population of interest. Despite the fact that hypothesis testing has been a cornerstone of empirical research for many years, various aspects of hypothesis tests are commonly applied incorrectly, misinterpreted, and ignored by novices and expert researchers alike. At first glance, hypothesis testing appears straightforward: develop the null and alternative hypotheses, compute the test statistic to compare to a standard distribution, estimate the probability of rejecting the null hypothesis, and then make claims about the importance of the finding. This is an oversimplification of the process of hypothesis testing. Hypothesis testing as applied in empirical research is examined here. The reader is assumed to have a basic knowledge of the role of hypothesis testing in various statistical methods. Through the use of an example, the mechanics of hypothesis testing are first reviewed. Then, five precautions surrounding the use and interpretation of hypothesis tests are developed; examples of each are provided to demonstrate how errors are made, and solutions are identified so similar errors can be avoided. Remedies are provided for common errors, and conclusions are drawn on how to use the results of this paper to improve the conduct of empirical research in transportation.
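As a hedged illustration of the mechanics described above (not the example used in the paper), the sketch below runs a two-sample t-test on simulated travel-time data and walks through the null hypothesis, test statistic and p-value; the data and significance level are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical travel times (minutes) on two routes; H0: equal means, H1: means differ
route_a = rng.normal(loc=32.0, scale=5.0, size=40)
route_b = rng.normal(loc=35.0, scale=5.0, size=40)

# Test statistic compared against a t distribution (Welch's t-test)
t_stat, p_value = stats.ttest_ind(route_a, route_b, equal_var=False)

alpha = 0.05  # pre-specified significance level
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: the mean travel times differ at the 5% level.")
else:
    print("Fail to reject H0: no evidence of a difference at the 5% level.")
# Note: a small p-value measures evidence against H0, not practical importance.
```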
Abstract:
Diffusion equations that use time fractional derivatives are attractive because they describe a wealth of problems involving non-Markovian random walks. The time fractional diffusion equation (TFDE) is obtained from the standard diffusion equation by replacing the first-order time derivative with a fractional derivative of order α ∈ (0, 1). Developing numerical methods for solving fractional partial differential equations is a new research field, and the theoretical analysis of these methods is not yet fully developed. In this paper, an explicit conservative difference approximation (ECDA) for the TFDE is proposed. We give a detailed analysis of this ECDA and generate discrete models of random walk suitable for simulating random variables whose spatial probability density evolves in time according to this fractional diffusion equation. The stability and convergence of the ECDA for the TFDE in a bounded domain are discussed. Finally, some numerical examples are presented to show the application of the present technique.
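For reference, a generic form of the TFDE referred to above replaces the first-order time derivative with a fractional derivative of order α ∈ (0, 1), often written with a Caputo derivative; the notation below is assumed, not taken from the paper.

```latex
% Time fractional diffusion equation (generic form; notation assumed)
\frac{\partial^{\alpha} u(x,t)}{\partial t^{\alpha}}
   = K_{\alpha}\, \frac{\partial^{2} u(x,t)}{\partial x^{2}},
   \qquad 0 < \alpha < 1,
\quad\text{with}\quad
\frac{\partial^{\alpha} u}{\partial t^{\alpha}}
   = \frac{1}{\Gamma(1-\alpha)}
     \int_{0}^{t} \frac{\partial u(x,s)}{\partial s}\,(t-s)^{-\alpha}\, ds .
```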
Abstract:
Cell invasion involves a population of cells which are motile and proliferative. Traditional discrete models of proliferation involve agents depositing daughter agents on nearest-neighbor lattice sites. Motivated by time-lapse images of cell invasion, we propose and analyze two new discrete proliferation models in the context of an exclusion process with an undirected motility mechanism. These discrete models are related to a family of reaction-diffusion equations and can be used to make predictions over a range of scales appropriate for interpreting experimental data. The new proliferation mechanisms are biologically relevant and mathematically convenient as the continuum-discrete relationship is more robust for the new proliferation mechanisms relative to traditional approaches.
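A minimal, hedged sketch of a lattice-based exclusion process with undirected motility and nearest-neighbour proliferation, of the traditional kind the abstract contrasts against; the update rules, rates and lattice size are illustrative assumptions, not the authors' new mechanisms.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1D lattice exclusion process: 1 = occupied, 0 = empty (illustrative values)
L, p_prolif, steps = 200, 0.05, 100
lattice = np.zeros(L, dtype=int)
lattice[:20] = 1  # initial population at the left boundary

for _ in range(steps):
    for i in rng.permutation(np.flatnonzero(lattice)):
        if lattice[i] == 0:
            continue  # this site was vacated earlier in the same step
        target = i + rng.choice((-1, 1))  # undirected: left or right with equal chance
        if not (0 <= target < L) or lattice[target] == 1:
            continue  # crowding: the event is aborted if the target site is occupied
        if rng.random() < p_prolif:
            lattice[target] = 1  # proliferation: daughter agent on the empty neighbour
        else:
            lattice[i], lattice[target] = 0, 1  # motility: move to the empty neighbour

print("total agents after", steps, "steps:", int(lattice.sum()))
```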
Abstract:
Optimal design for generalized linear models has primarily focused on univariate data. Often experiments are performed that have multiple dependent responses described by regression-type models, and it is of interest and of value to design the experiment for all these responses. This requires a multivariate distribution underlying a pre-chosen model for the data. Here, we consider the design of experiments for bivariate binary data which are dependent. We explore copula functions, which provide a rich and flexible class of structures for deriving joint distributions for bivariate binary data. We present methods for deriving optimal experimental designs for dependent bivariate binary data using copulas, and demonstrate that, by including the dependence between responses in the design process, more efficient parameter estimates are obtained than by the usual practice of designing for a single variable only. Further, we investigate the robustness of designs with respect to initial parameter estimates and the choice of copula function, and also show the performance of compound criteria within this bivariate binary setting.
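A hedged sketch of one way to build a joint distribution for dependent bivariate binary responses from a copula, here a Gaussian copula with assumed marginal success probabilities and dependence parameter; the paper explores copula families more generally, and these values are illustrative only.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Assumed marginal success probabilities and Gaussian-copula dependence parameter
p1, p2, rho = 0.6, 0.3, 0.5

# Joint probability P(Y1=1, Y2=1) from the Gaussian copula C(u, v; rho)
cov = [[1.0, rho], [rho, 1.0]]
p11 = multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf([norm.ppf(p1), norm.ppf(p2)])

# Remaining cells of the 2x2 joint distribution follow from the margins
p10 = p1 - p11
p01 = p2 - p11
p00 = 1.0 - p11 - p10 - p01
print(f"P(1,1)={p11:.3f}  P(1,0)={p10:.3f}  P(0,1)={p01:.3f}  P(0,0)={p00:.3f}")
# Under independence (rho = 0), P(1,1) would be p1*p2 = 0.18; the copula shifts it.
```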
Abstract:
Breast cancer is a leading contributor to the burden of disease in Australia. Fortunately, the recent introduction of diverse therapeutic strategies has improved the survival outcome for many women. Despite this, the clinical management of breast cancer remains problematic, as current approaches are not sufficiently sophisticated to take into account the heterogeneity of this disease and are unable to predict disease progression, in particular, metastasis. As such, women with good prognostic outcomes are exposed to the side effects of therapies without added benefit. Furthermore, women with aggressive disease for whom these advanced treatments would deliver benefit cannot be distinguished, and opportunities for more intensive or novel treatment are lost. This study is designed to identify novel factors associated with disease progression that have the potential to inform disease prognosis. Frequently overlooked, yet common, mediators of disease are the interactions that take place between the insulin-like growth factor (IGF) system and the extracellular matrix (ECM). Our laboratory has previously demonstrated that multiprotein insulin-like growth factor-I (IGF-I):insulin-like growth factor binding protein (IGFBP):vitronectin (VN) complexes stimulate migration of breast cancer cells in vitro, via the cooperative involvement of the insulin-like growth factor type I receptor (IGF-IR) and VN-binding integrins. However, the effects of IGF and ECM protein interactions on the dissemination and progression of breast cancer in vivo are unknown. It was hypothesised that interactions between proteins required for IGF-induced signalling events and those within the ECM contribute to breast cancer metastasis and are prognostic and predictive indicators of patient outcome. To address this hypothesis, semiquantitative immunohistochemistry (IHC) analyses were performed to compare the extracellular and subcellular distribution of IGF- and ECM-induced signalling proteins between matched normal, primary cancer, and metastatic cancer among archival formalin-fixed paraffin-embedded (FFPE) breast tissue samples collected from women attending the Princess Alexandra Hospital, Brisbane. Multivariate Cox proportional hazards (PH) regression survival models, in conjunction with a modified 'purposeful selection of covariates' method, were applied to determine the prognostic potential of these proteins. This study provides the first in-depth, compartmentalised analysis of the distribution of IGF- and ECM-induced signalling proteins. As protein function and protein localisation are closely correlated, these findings provide novel insights into IGF signalling and ECM protein function during breast cancer development and progression. Distinct IGF signalling and ECM protein immunoreactivity was observed in the stroma and/or in subcellular locations in normal breast, primary cancer and metastatic cancer tissues. Analysis of the presence and location of stratifin (SFN) suggested a causal relationship in ECM remodelling events during breast cancer development and progression. The results of this study also suggested that fibronectin (FN) and β1 integrin are important for the formation of invadopodia and epithelial-to-mesenchymal transition (EMT) events.
Our data also highlighted the importance of the temporal and spatial distribution of IGF-induced signalling proteins in breast cancer metastasis; in particular, SFN, enhancer-of-split and hairy-related protein 2 (SHARP-2), total AKT/protein kinase B 1 (Total-AKT1), phosphorylated AKT/protein kinase B (P-AKT), extracellular signal-regulated kinases 1 and 2 (ERK1/2) and phosphorylated extracellular signal-regulated kinases 1 and 2 (P-ERK1/2). Multivariate survival models were created from the immunohistochemical data. These models were found to fit the data well, with very high statistical confidence. Numerous prognostic confounding effects and effect modifications were identified among elements of the ECM and IGF signalling cascade, corroborating the survival models. This finding provides further evidence for the prognostic potential of IGF- and ECM-induced signalling proteins. In addition, the adjusted measures of association obtained in this study have strengthened the validity and utility of the resulting models. The findings from this study provide insights into the biological interactions that occur during the development of breast tissue and contribute to disease progression. Importantly, these multivariate survival models could provide important prognostic and predictive indicators that assist the clinical management of breast disease, namely in the early identification of cancers with a propensity to metastasise and/or recur following adjuvant therapy. The outcomes of this study further inform the development of new therapeutics to aid patient recovery. The findings from this study have widespread clinical application in the diagnosis of disease and prognosis of disease progression, and inform the most appropriate clinical management of individuals with breast cancer.
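As a hedged illustration of fitting a multivariate Cox proportional hazards survival model of the kind described above (not the study's data or covariates), using the lifelines package with simulated, purely illustrative marker scores and follow-up times:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
n = 200

# Simulated (illustrative) immunohistochemistry scores and follow-up data
df = pd.DataFrame({
    "marker_a": rng.normal(size=n),                 # e.g. a stromal staining score
    "marker_b": rng.normal(size=n),                 # e.g. a nuclear staining score
    "time": rng.exponential(scale=60.0, size=n),    # months of follow-up
    "event": rng.integers(0, 2, size=n),            # 1 = recurrence/death observed
})

# Multivariate Cox proportional hazards model; hazard ratios = exp(coefficients)
cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
cph.print_summary()
```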
Abstract:
We define a pair-correlation function that can be used to characterize spatiotemporal patterning in experimental images and snapshots from discrete simulations. Unlike previous pair-correlation functions, the pair-correlation functions developed here depend on the location and size of objects. The pair-correlation function can be used to indicate complete spatial randomness, aggregation or segregation over a range of length scales, and quantifies spatial structures such as the shape, size and distribution of clusters. Comparing pair-correlation data for various experimental and simulation images illustrates the potential of these functions as a summary statistic for calibrating discrete models of various physical processes.
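A hedged sketch of a basic, point-based pair-correlation estimate from object centroids; the function defined in the paper also accounts for object location and size, which this simplified version omits, and edge effects are ignored here.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative point pattern in a unit square (e.g. cell centroids from an image)
n, area = 400, 1.0
pts = rng.random((n, 2))

# Pairwise distances between distinct points (upper triangle only)
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
d = d[np.triu_indices(n, k=1)]

# Bin distances and normalise by the expectation under complete spatial randomness
bins = np.linspace(0.01, 0.25, 25)
counts, edges = np.histogram(d, bins=bins)
r = 0.5 * (edges[:-1] + edges[1:])
dr = np.diff(edges)
# Expected pair counts under CSR: probability a second uniform point lies in the
# annulus [r, r+dr) is roughly 2*pi*r*dr/area (edge effects ignored)
expected = (n * (n - 1) / 2) * 2 * np.pi * r * dr / area
g = counts / expected
print(np.round(g, 2))  # ~1 randomness; >1 aggregation; <1 segregation (biased low at large r)
```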
Abstract:
Early detection, clinical management and disease recurrence monitoring are critical areas of cancer treatment in which specific biomarker panels are likely to be very important. We have previously demonstrated that levels of alpha-2-Heremans-Schmid glycoprotein (AHSG), complement component C3 (C3), clusterin (CLI), haptoglobin (HP) and serum amyloid A (SAA) are significantly altered in serum from patients with squamous cell carcinoma of the lung. Here, we report the abundance levels for these proteins in serum samples from patients with advanced breast cancer, colorectal cancer (CRC) and lung cancer compared to healthy controls (age and gender matched) using commercially available enzyme-linked immunosorbent assay kits. Logistic regression (LR) models were fitted to the resulting data, and the classification ability of the proteins was evaluated using receiver-operating characteristic curves and leave-one-out cross-validation (LOOCV). The most accurate individual candidate biomarkers were C3 for breast cancer [area under the curve (AUC) = 0.89, LOOCV = 73%], CLI for CRC (AUC = 0.98, LOOCV = 90%), HP for small cell lung carcinoma (AUC = 0.97, LOOCV = 88%), C3 for lung adenocarcinoma (AUC = 0.94, LOOCV = 89%) and HP for squamous cell carcinoma of the lung (AUC = 0.94, LOOCV = 87%). The best dual combinations of biomarkers using LR analysis were found to be AHSG + C3 (AUC = 0.91, LOOCV = 83%) for breast cancer, CLI + HP (AUC = 0.98, LOOCV = 92%) for CRC, C3 + SAA (AUC = 0.97, LOOCV = 91%) for small cell lung carcinoma and HP + SAA for both adenocarcinoma (AUC = 0.98, LOOCV = 96%) and squamous cell carcinoma of the lung (AUC = 0.98, LOOCV = 84%). The high AUC values reported here indicate that these candidate biomarkers have the potential to discriminate accurately between control and cancer groups, both individually and in combination with other proteins. Copyright © 2011 UICC.
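A hedged sketch of the evaluation pipeline described above: fitting a logistic regression on two serum markers and scoring it with leave-one-out cross-validation and ROC AUC. The data here are simulated placeholders, not the study's measurements.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import roc_auc_score, accuracy_score

rng = np.random.default_rng(4)

# Simulated serum protein levels for 50 controls (0) and 50 cases (1)
y = np.repeat([0, 1], 50)
X = np.column_stack([
    rng.normal(loc=np.where(y == 1, 1.0, 0.0), scale=1.0),  # placeholder "marker 1"
    rng.normal(loc=np.where(y == 1, 0.8, 0.0), scale=1.0),  # placeholder "marker 2"
])

model = LogisticRegression()

# Leave-one-out cross-validated class predictions and probabilities
loo = LeaveOneOut()
pred = cross_val_predict(model, X, y, cv=loo)
prob = cross_val_predict(model, X, y, cv=loo, method="predict_proba")[:, 1]

print(f"LOOCV accuracy: {accuracy_score(y, pred):.2%}")
print(f"AUC (cross-validated probabilities): {roc_auc_score(y, prob):.2f}")
```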
Abstract:
This study constructs performance prediction models to estimate the end-user perceived video quality on mobile devices for the latest video encoding techniques, VP9 and H.265. Both subjective and objective video quality assessments were carried out to collect data and select the most desirable predictors. Using statistical regression, two models were generated, achieving prediction accuracies of 94.5% and 91.5% respectively, depending on whether the predictor derived from the objective assessment is included. These proposed models can be directly used by media industries for video quality estimation, and will ultimately help them to ensure a positive end-user quality of experience on future mobile devices after the adoption of the latest video encoding technologies.
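A hedged sketch of this kind of regression-based quality model: predicting a subjective mean opinion score from encoding parameters and an objective quality metric with ordinary least squares, once with and once without the objective predictor. The predictors and data are illustrative placeholders, not those selected in the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 120

# Illustrative predictors: bitrate (Mbps), resolution height, objective metric score
bitrate = rng.uniform(0.5, 8.0, n)
height = rng.choice([480, 720, 1080], size=n)
objective = 60 + 4 * bitrate + 0.01 * height + rng.normal(0, 2, n)  # PSNR-like placeholder

# Simulated subjective mean opinion score (1-5) loosely driven by the predictors
mos = np.clip(1 + 0.35 * bitrate + 0.001 * height + 0.01 * objective
              + rng.normal(0, 0.3, n), 1, 5)

# Model with the objective-assessment predictor included
X = sm.add_constant(np.column_stack([bitrate, height, objective]))
print(f"R^2 with objective predictor: {sm.OLS(mos, X).fit().rsquared:.3f}")

# Model without it, mirroring the two-model comparison described above
X2 = sm.add_constant(np.column_stack([bitrate, height]))
print(f"R^2 without objective predictor: {sm.OLS(mos, X2).fit().rsquared:.3f}")
```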
Abstract:
We consider estimating the total load from frequent flow data but less frequent concentration data. There are numerous load estimation methods available, some of which are captured in various online tools. However, most estimators are subject to large statistical biases, and their associated uncertainties are often not reported. This makes interpretation difficult, and the estimation of trends or the determination of optimal sampling regimes impossible to assess. In this paper, we first propose two indices for measuring the extent of sampling bias, and then provide steps for obtaining reliable load estimates that minimize the biases and make use of informative predictive variables. The key step in this approach is the development of an appropriate predictive model for concentration. This is achieved using a generalized rating-curve approach with additional predictors that capture unique features in the flow data, such as the concept of the first flush, the location of the event on the hydrograph (e.g. rise or fall) and the discounted flow. The latter may be thought of as a measure of constituent exhaustion occurring during flood events. Including this additional information can significantly improve the predictability of concentration, and ultimately the precision with which the pollutant load is estimated. We also provide a measure of the standard error of the load estimate which incorporates model, spatial and/or temporal errors. This method also has the capacity to incorporate measurement error incurred through the sampling of flow. We illustrate this approach for two rivers delivering to the Great Barrier Reef, Queensland, Australia. One is a data set from the Burdekin River, consisting of total suspended sediment (TSS), nitrogen oxides (NOx) and gauged flow for 1997. The other is from the Tully River, for the period of July 2000 to June 2008. For NOx in the Burdekin, the new estimates are very similar to the ratio estimates even when there is no relationship between concentration and flow. However, for the Tully dataset, by incorporating the additional predictive variables, namely the discounted flow and flow phases (rising or receding), we substantially improved the model fit, and thus the certainty with which the load is estimated.
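A hedged sketch of a basic log-log rating-curve regression of concentration on flow and the resulting load estimate; the generalized approach in the paper adds further predictors (first flush, hydrograph phase, discounted flow) and a proper standard error, which this minimal version omits. The data are simulated placeholders.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)

# Simulated record: flow observed daily, concentration sampled on a subset of days
flow = rng.lognormal(mean=3.0, sigma=1.0, size=365)          # daily flow (m^3/s)
sampled = rng.choice(365, size=40, replace=False)             # days with a concentration sample
true_conc = 5.0 * flow**0.4 * rng.lognormal(0.0, 0.3, 365)    # mg/L, flow-dependent
conc_obs = true_conc[sampled]

# Log-log rating curve: log(C) = b0 + b1*log(Q) + error, fitted by OLS
X = sm.add_constant(np.log(flow[sampled]))
fit = sm.OLS(np.log(conc_obs), X).fit()

# Predict concentration every day, apply a simple log-normal back-transformation
# bias correction, and integrate to an annual load
log_pred = fit.predict(sm.add_constant(np.log(flow)))
conc_pred = np.exp(log_pred + 0.5 * fit.mse_resid)
load = np.sum(conc_pred * flow * 86400) / 1e6  # tonnes (conc in mg/L = g/m^3, flow in m^3/s)
print(f"estimated annual load: {load:.1f} t")
```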
Abstract:
In the context of increasing threats to the sensitive marine ecosystem from toxic metals, this study investigated metal build-up on impervious surfaces specific to commercial seaports. The knowledge generated in this study will contribute to managing toxic metal pollution of the marine ecosystem. The study found that inter-modal operations and the main access roadway had the highest loads, followed by container storage and vehicle marshalling sites, while the quay line and short-term storage areas had the lowest. Additionally, it was found that Cr, Al, Pb, Cu and Zn were predominantly attached to solids, while significant amounts of Cu, Pb and Zn were found as nutrient complexes. As such, treatment options based on solids retention can be effective for some metal species, while ineffective for others. Furthermore, Cu and Zn are more likely to become bioavailable in seawater due to their strong association with nutrients. Mathematical models to replicate the metal build-up process were also developed using an experimental design approach and partial least squares regression. The models for Cr and Pb were found to be reliable, while those for Al, Zn and Cu were less reliable but could still be employed for preliminary investigations.
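A hedged sketch of fitting a partial least squares regression of metal build-up load on site predictors, in the spirit of the modelling described above; the variables and data are illustrative placeholders, not the study's measurements.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n = 60

# Simulated predictors: antecedent dry days, traffic volume, total solids build-up
X = np.column_stack([
    rng.uniform(1, 20, n),       # antecedent dry days
    rng.uniform(100, 5000, n),   # daily vehicle movements
    rng.uniform(10, 400, n),     # total solids build-up (mg/m^2)
])
# Simulated response: Zn build-up loosely driven by solids and traffic, plus noise
y = 0.05 * X[:, 2] + 0.001 * X[:, 1] + rng.normal(0, 2, n)

pls = PLSRegression(n_components=2)
pls.fit(X, y)

# Cross-validated R^2 as a quick check of model reliability
r2_cv = cross_val_score(pls, X, y, cv=5, scoring="r2").mean()
print(f"cross-validated R^2: {r2_cv:.2f}")
```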
Abstract:
In this thesis, the issue of incorporating uncertainty for environmental modelling informed by imagery is explored by considering uncertainty in deterministic modelling, measurement uncertainty and uncertainty in image composition. Incorporating uncertainty in deterministic modelling is extended for use with imagery using the Bayesian melding approach. In the application presented, slope steepness is shown to be the main contributor to total uncertainty in the Revised Universal Soil Loss Equation. A spatial sampling procedure is also proposed to assist in implementing Bayesian melding given the increased data size with models informed by imagery. Measurement error models are another approach to incorporating uncertainty when data are informed by imagery. These models for measurement uncertainty, considered in a Bayesian conditional independence framework, are applied to ecological data generated from imagery. The models are shown to be appropriate and useful in certain situations. Measurement uncertainty is also considered in the context of change detection when two images are not co-registered. An approach for detecting change in two successive images is proposed that does not rely on image registration. The procedure uses the Kolmogorov-Smirnov test on homogeneous segments of an image to detect change, with the homogeneous segments determined using a Bayesian mixture model of pixel values. Using the mixture model to segment an image also allows for uncertainty in the composition of an image. This thesis concludes by comparing several different Bayesian image segmentation approaches that allow for uncertainty regarding the allocation of pixels to different ground components. Each segmentation approach is applied to a data set of chlorophyll values and shown to have different benefits and drawbacks depending on the aims of the analysis.
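A hedged sketch of the registration-free change check described above: pixel values from corresponding homogeneous segments of two images are compared with a two-sample Kolmogorov-Smirnov test. The thesis identifies the segments with a Bayesian mixture model; here the segments, pixel values and decision threshold are simulated placeholders.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(7)

# Pixel values from the "same" homogeneous segment in two successive images
segment_t1 = rng.normal(loc=0.40, scale=0.05, size=500)   # e.g. chlorophyll index, time 1
segment_t2 = rng.normal(loc=0.55, scale=0.05, size=500)   # time 2, after a real change

# Two-sample KS test: it compares whole distributions, so exact pixel-by-pixel
# co-registration of the two images is not required
stat, p_value = ks_2samp(segment_t1, segment_t2)
print(f"KS statistic = {stat:.3f}, p = {p_value:.2e}")
if p_value < 0.01:
    print("Distributions differ: flag this segment as changed.")
```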
Abstract:
There has been considerable research conducted over the last 20 years focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions that come with each, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states: perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of “excess” zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriately model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a Bernoulli trial with unequal probability of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data. A simulation experiment is then conducted to demonstrate how crash data give rise to the “excess” zeros frequently observed. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed, and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales, not from an underlying dual-state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (for observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros.
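A hedged sketch of the kind of simulation argument summarised above: crash counts generated as Poisson outcomes with low exposure and unobserved heterogeneity produce a large share of zeros without any dual-state mechanism. The exposure and rate values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(8)

# 1000 road segments, each with a low expected crash count over the analysis period
# (e.g. short segments, low traffic exposure, short time scale)
mu = rng.gamma(shape=0.8, scale=0.5, size=1000)   # heterogeneous Poisson means
counts = rng.poisson(mu)

observed_zeros = np.mean(counts == 0)
# Share of zeros a single homogeneous Poisson with the same overall mean would give
homogeneous_zeros = poisson(mu.mean()).pmf(0)

print(f"mean count: {counts.mean():.2f}")
print(f"observed share of zeros: {observed_zeros:.2%}")
print(f"zeros under a homogeneous Poisson with the same mean: {homogeneous_zeros:.2%}")
# The 'excess' zeros arise here from low exposure and unobserved heterogeneity,
# not from a separate 'perfectly safe' state.
```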
Abstract:
We examine the impact of individual-specific information processing strategies (IPSs) regarding the inclusion/exclusion of attributes on the parameter estimates and behavioural outputs of models of discrete choice. Current practice assumes that individuals employ a homogeneous IPS with regard to how they process the attributes of stated choice (SC) experiments. We show how information collected exogenously to the SC experiment, on whether respondents either ignored or considered each attribute, may be used in the estimation process, and how such information provides outputs that are IPS-segment specific. We contend that accounting for the inclusion/exclusion of attributes will result in behaviourally richer population parameter estimates.
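A hedged, much-simplified sketch of the idea: in a binary stated-choice logit, attributes a respondent reports ignoring are zeroed out, so their attribute differences contribute nothing to that respondent's utility before estimation. The data, the "considered" mask and the model form are illustrative assumptions, not the authors' specification.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 1000  # choice observations

# Attribute differences between alternative A and alternative B (cost, time)
d_cost = rng.normal(size=n)
d_time = rng.normal(size=n)

# Exogenously collected IPS information: 1 = attribute considered, 0 = ignored
consider_cost = rng.integers(0, 2, size=n)
consider_time = np.ones(n, dtype=int)  # assume everyone considers time

# Simulated choices: ignored attributes genuinely carry no weight
utility = -1.0 * d_cost * consider_cost - 0.5 * d_time * consider_time
choice = (utility + rng.logistic(size=n) > 0).astype(int)

# Estimation honouring the stated IPS: masked attributes enter as zeros
X = sm.add_constant(np.column_stack([d_cost * consider_cost, d_time * consider_time]))
fit = sm.Logit(choice, X).fit(disp=False)
print(fit.params)  # coefficients apply only where the attribute was actually considered
```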