873 resultados para linear mixing model
Finite mixture regression model with random effects: application to neonatal hospital length of stay
Resumo:
A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of generalised linear mixed model (GLMM). The method can be extended to a g-component mixture regression model with the component density from the exponential family, leading to the development of the class of finite mixture GLMM. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.
Resumo:
Objective: The objective of the present study is to test the validity of the integrated cognitive model (ICM) of depression proposed by Kwon and Oei with a Latin-American sample. The ICM of depression postulates that the interaction between negative life events with dysfunctional attitudes increases the frequency of negative automatic thoughts, which in turns affects the depressive symptomatology of a person. This model was developed for Western Europeans such as Americans and Australians and the validity of this model has not been tested on Latin-Americans. Method: Participants were 101 Latin-American migrants living permanently in Brisbane, including people from Chile, El Salvador, Nicaragua, Argentina and Guatemala. Participants completed the Beck Depression Inventory, the Dysfunctional Attitudes Scale, the Automatic Thoughts Questionnaire and the Life Events Inventory. Alternative or competing models of depression were examined, including the alternative aetiologies model, the linear mediational model and the symptom model. Results: Six models were tested and the results of the structural equation modelling analysis indicated that the symptom model only fits the Latin-American data. Conclusions: Results show that in the Latin-American sample depression symptoms can have an impact on negative cognitions. This finding adds to growing evidence in the literature that the relationship between cognitions and depression is bidirectional, rather than unidirectional from cognitions to symptoms.
Resumo:
Based on the three-dimensional elastic inclusion model proposed by Dobrovolskii, we developed a rheological inclusion model to study earthquake preparation processes. By using the Corresponding Principle in the theory of rheologic mechanics, we derived the analytic expressions of viscoelastic displacement U(r, t) , V(r, t) and W(r, t), normal strains epsilon(xx) (r, t), epsilon(yy) (r, t) and epsilon(zz) (r, t) and the bulk strain theta (r, t) at an arbitrary point (x, y, z) in three directions of X axis, Y axis and Z axis produced by a three-dimensional inclusion in the semi-infinite rheologic medium defined by the standard linear rheologic model. Subsequent to the spatial-temporal variation of bulk strain being computed on the ground produced by such a spherical rheologic inclusion, interesting results are obtained, suggesting that the bulk strain produced by a hard inclusion change with time according to three stages (alpha, beta, gamma) with different characteristics, similar to that of geodetic deformation observations, but different with the results of a soft inclusion. These theoretical results can be used to explain the characteristics of spatial-temporal evolution, patterns, quadrant-distribution of earthquake precursors, the changeability, spontaneity and complexity of short-term and imminent-term precursors. It offers a theoretical base to build physical models for earthquake precursors and to predict the earthquakes.
Resumo:
E. L. DeLosh, J. R. Busemeyer, and M. A. McDaniel (1997) found that when learning a positive, linear relationship between a continuous predictor (x) and a continuous criterion (y), trainees tend to underestimate y on items that ask the trainee to extrapolate. In 3 experiments, the authors examined the phenomenon and found that the tendency to underestimate y is reliable only in the so-called lower extrapolation region-that is, new values of x that lie between zero and the edge of the training region. Existing models of function learning, such as the extrapolation-association model (DeLosh et al., 1997) and the population of linear experts model (M. L. Kalish, S. Lewandowsky, & J. Kruschke, 2004), cannot account for these results. The authors show that with minor changes, both models can predict the correct pattern of results.
Resumo:
Exploratory analysis of data in all sciences seeks to find common patterns to gain insights into the structure and distribution of the data. Typically visualisation methods like principal components analysis are used but these methods are not easily able to deal with missing data nor can they capture non-linear structure in the data. One approach to discovering complex, non-linear structure in the data is through the use of linked plots, or brushing, while ignoring the missing data. In this technical report we discuss a complementary approach based on a non-linear probabilistic model. The generative topographic mapping enables the visualisation of the effects of very many variables on a single plot, which is able to incorporate far more structure than a two dimensional principal components plot could, and deal at the same time with missing data. We show that using the generative topographic mapping provides us with an optimal method to explore the data while being able to replace missing values in a dataset, particularly where a large proportion of the data is missing.
Resumo:
For a submitted query to multiple search engines finding relevant results is an important task. This paper formulates the problem of aggregation and ranking of multiple search engines results in the form of a minimax linear programming model. Besides the novel application, this study detects the most relevant information among a return set of ranked lists of documents retrieved by distinct search engines. Furthermore, two numerical examples aree used to illustrate the usefulness of the proposed approach.
Resumo:
Exploratory analysis of data seeks to find common patterns to gain insights into the structure and distribution of the data. In geochemistry it is a valuable means to gain insights into the complicated processes making up a petroleum system. Typically linear visualisation methods like principal components analysis, linked plots, or brushing are used. These methods can not directly be employed when dealing with missing data and they struggle to capture global non-linear structures in the data, however they can do so locally. This thesis discusses a complementary approach based on a non-linear probabilistic model. The generative topographic mapping (GTM) enables the visualisation of the effects of very many variables on a single plot, which is able to incorporate more structure than a two dimensional principal components plot. The model can deal with uncertainty, missing data and allows for the exploration of the non-linear structure in the data. In this thesis a novel approach to initialise the GTM with arbitrary projections is developed. This makes it possible to combine GTM with algorithms like Isomap and fit complex non-linear structure like the Swiss-roll. Another novel extension is the incorporation of prior knowledge about the structure of the covariance matrix. This extension greatly enhances the modelling capabilities of the algorithm resulting in better fit to the data and better imputation capabilities for missing data. Additionally an extensive benchmark study of the missing data imputation capabilities of GTM is performed. Further a novel approach, based on missing data, will be introduced to benchmark the fit of probabilistic visualisation algorithms on unlabelled data. Finally the work is complemented by evaluating the algorithms on real-life datasets from geochemical projects.
Resumo:
Exploratory analysis of petroleum geochemical data seeks to find common patterns to help distinguish between different source rocks, oils and gases, and to explain their source, maturity and any intra-reservoir alteration. However, at the outset, one is typically faced with (a) a large matrix of samples, each with a range of molecular and isotopic properties, (b) a spatially and temporally unrepresentative sampling pattern, (c) noisy data and (d) often, a large number of missing values. This inhibits analysis using conventional statistical methods. Typically, visualisation methods like principal components analysis are used, but these methods are not easily able to deal with missing data nor can they capture non-linear structure in the data. One approach to discovering complex, non-linear structure in the data is through the use of linked plots, or brushing, while ignoring the missing data. In this paper we introduce a complementary approach based on a non-linear probabilistic model. Generative topographic mapping enables the visualisation of the effects of very many variables on a single plot, while also dealing with missing data. We show how using generative topographic mapping also provides an optimal method with which to replace missing values in two geochemical datasets, particularly where a large proportion of the data is missing.
Resumo:
The measurement of different aspects of information society has been problematic over along time, and the International Telecommunication Union (ITU) is spearheading in developing a single ICT index. In Geneva during the first World Summit on Information Society (WSIS) in December 2003, the heads of states declared their commitment to the importance of benchmarking and measuring progress toward the information society. Consequently, they re-affirmed their Geneva commitments in their second summit held in Tunis in 2005. In this paper, we propose a multiplicative linear programming model to measure Opportunity Index. We also compared our results with the common measure of ICT opportunity index and we found that the two indices are consistent in their measurement of digital opportunity though differences still exist among regions.
Resumo:
2000 Mathematics Subject Classification: 62J12, 62F35
Resumo:
The cell:cell bond between an immune cell and an antigen presenting cell is a necessary event in the activation of the adaptive immune response. At the juncture between the cells, cell surface molecules on the opposing cells form non-covalent bonds and a distinct patterning is observed that is termed the immunological synapse. An important binding molecule in the synapse is the T-cell receptor (TCR), that is responsible for antigen recognition through its binding with a major-histocompatibility complex with bound peptide (pMHC). This bond leads to intracellular signalling events that culminate in the activation of the T-cell, and ultimately leads to the expression of the immune eector function. The temporal analysis of the TCR bonds during the formation of the immunological synapse presents a problem to biologists, due to the spatio-temporal scales (nanometers and picoseconds) that compare with experimental uncertainty limits. In this study, a linear stochastic model, derived from a nonlinear model of the synapse, is used to analyse the temporal dynamics of the bond attachments for the TCR. Mathematical analysis and numerical methods are employed to analyse the qualitative dynamics of the nonequilibrium membrane dynamics, with the specic aim of calculating the average persistence time for the TCR:pMHC bond. A single-threshold method, that has been previously used to successfully calculate the TCR:pMHC contact path sizes in the synapse, is applied to produce results for the average contact times of the TCR:pMHC bonds. This method is extended through the development of a two-threshold method, that produces results suggesting the average time persistence for the TCR:pMHC bond is in the order of 2-4 seconds, values that agree with experimental evidence for TCR signalling. The study reveals two distinct scaling regimes in the time persistent survival probability density prole of these bonds, one dominated by thermal uctuations and the other associated with the TCR signalling. Analysis of the thermal fluctuation regime reveals a minimal contribution to the average time persistence calculation, that has an important biological implication when comparing the probabilistic models to experimental evidence. In cases where only a few statistics can be gathered from experimental conditions, the results are unlikely to match the probabilistic predictions. The results also identify a rescaling relationship between the thermal noise and the bond length, suggesting a recalibration of the experimental conditions, to adhere to this scaling relationship, will enable biologists to identify the start of the signalling regime for previously unobserved receptor:ligand bonds. Also, the regime associated with TCR signalling exhibits a universal decay rate for the persistence probability, that is independent of the bond length.
Resumo:
Trees and shrubs in tropical Africa use the C3 cycle as a carbon fixation pathway during photosynthesis, while grasses and sedges mostly use the C4 cycle. Leaf-wax lipids from sedimentary archives such as the long-chain n-alkanes (e.g., n-C27 to n-C33) inherit carbon isotope ratios that are representative of the carbon fixation pathway. Therefore, n-alkane d13C values are often used to reconstruct past C3/C4 composition of vegetation, assuming that the relative proportions of C3 and C4 leaf waxes reflect the relative proportions of C3 and C4 plants. We have compared the d13C values of n-alkanes from modern C3 and C4 plants with previously published values from recent lake sediments and provide a framework for estimating the fractional contribution (areal-based) of C3 vegetation cover (fC3) represented by these sedimentary archives. Samples were collected in Cameroon, across a latitudinal transect that accommodates a wide range of climate zones and vegetation types, as reflected in the progressive northward replacement of C3-dominated rain forest by C4-dominated savanna. The C3 plants analysed were characterised by substantially higher abundances of n-C29 alkanes and by substantially lower abundances of n-C33 alkanes than the C4 plants. Furthermore, the sedimentary d13C values of n-C29 and n-C31 alkanes from recent lake sediments in Cameroon (-37.4 per mil to -26.5 per mil) were generally within the range of d13C values for C3 plants, even when from sites where C4 plants dominated the catchment vegetation. In such cases simple linear mixing models fail to accurately reconstruct the relative proportions of C3 and C4 vegetation cover when using the d13C values of sedimentary n-alkanes, overestimating the proportion of C3 vegetation, likely as a consequence of the differences in plant wax production, preservation, transport, and/or deposition between C3 and C4 plants. We therefore tested a set of non-linear binary mixing models using d13C values from both C3 and C4 vegetation as end-members. The non-linear models included a sigmoid function (sine-squared) that describes small variations in the fC3 values as the minimum and maximum d13C values are approached, and a hyperbolic function that takes into account the differences between C3 and C4 plants discussed above. Model fitting and the estimation of uncertainties were completed using the Monte Carlo algorithm and can be improved by future data addition. Models that provided the best fit with the observed d13C values of sedimentary n-alkanes were either hyperbolic functions or a combination of hyperbolic and sine-squared functions. Such non-linear models may be used to convert d13C measurements on sedimentary n-alkanes directly into reconstructions of C3 vegetation cover.
Resumo:
A novel surrogate model is proposed in lieu of computational fluid dynamic (CFD) code for fast nonlinear aerodynamic modeling. First, a nonlinear function is identified on selected interpolation points defined by discrete empirical interpolation method (DEIM). The flow field is then reconstructed by a least square approximation of flow modes extracted by proper orthogonal decomposition (POD). The proposed model is applied in the prediction of limit cycle oscillation for a plunge/pitch airfoil and a delta wing with linear structural model, results are validate against a time accurate CFD-FEM code. The results show the model is able to replicate the aerodynamic forces and flow fields with sufficient accuracy while requiring a fraction of CFD cost.
Resumo:
Statistical association between a single nucleotide polymorphism (SNP) genotype and a quantitative trait in genome-wide association studies is usually assessed using a linear regression model, or, in the case of non-normally distributed trait values, using the Kruskal-Wallis test. While linear regression models assume an additive mode of inheritance via equi-distant genotype scores, Kruskal-Wallis test merely tests global differences in trait values associated with the three genotype groups. Both approaches thus exhibit suboptimal power when the underlying inheritance mode is dominant or recessive. Furthermore, these tests do not perform well in the common situations when only a few trait values are available in a rare genotype category (disbalance), or when the values associated with the three genotype categories exhibit unequal variance (variance heterogeneity). We propose a maximum test based on Marcus-type multiple contrast test for relative effect sizes. This test allows model-specific testing of either dominant, additive or recessive mode of inheritance, and it is robust against variance heterogeneity. We show how to obtain mode-specific simultaneous confidence intervals for the relative effect sizes to aid in interpreting the biological relevance of the results. Further, we discuss the use of a related all-pairwise comparisons contrast test with range preserving confidence intervals as an alternative to Kruskal-Wallis heterogeneity test. We applied the proposed maximum test to the Bogalusa Heart Study dataset, and gained a remarkable increase in the power to detect association, particularly for rare genotypes. Our simulation study also demonstrated that the proposed non-parametric tests control family-wise error rate in the presence of non-normality and variance heterogeneity contrary to the standard parametric approaches. We provide a publicly available R library nparcomp that can be used to estimate simultaneous confidence intervals or compatible multiplicity-adjusted p-values associated with the proposed maximum test.
Resumo:
The role of computer modeling has grown recently to integrate itself as an inseparable tool to experimental studies for the optimization of automotive engines and the development of future fuels. Traditionally, computer models rely on simplified global reaction steps to simulate the combustion and pollutant formation inside the internal combustion engine. With the current interest in advanced combustion modes and injection strategies, this approach depends on arbitrary adjustment of model parameters that could reduce credibility of the predictions. The purpose of this study is to enhance the combustion model of KIVA, a computational fluid dynamics code, by coupling its fluid mechanics solution with detailed kinetic reactions solved by the chemistry solver, CHEMKIN. As a result, an engine-friendly reaction mechanism for n-heptane was selected to simulate diesel oxidation. Each cell in the computational domain is considered as a perfectly-stirred reactor which undergoes adiabatic constant- volume combustion. The model was applied to an ideally-prepared homogeneous- charge compression-ignition combustion (HCCI) and direct injection (DI) diesel combustion. Ignition and combustion results show that the code successfully simulates the premixed HCCI scenario when compared to traditional combustion models. Direct injection cases, on the other hand, do not offer a reliable prediction mainly due to the lack of turbulent-mixing model, inherent in the perfectly-stirred reactor formulation. In addition, the model is sensitive to intake conditions and experimental uncertainties which require implementation of enhanced predictive tools. It is recommended that future improvements consider turbulent-mixing effects as well as optimization techniques to accurately simulate actual in-cylinder process with reduced computational cost. Furthermore, the model requires the extension of existing fuel oxidation mechanisms to include pollutant formation kinetics for emission control studies.