93 results for Dynamic data set visualization
Abstract:
The estimation of effective population size from one sample of genotypes has been problematic because most estimators have proven imprecise or biased. We developed a web-based program, ONeSAMP, that uses approximate Bayesian computation to estimate effective population size from a sample of microsatellite genotypes. ONeSAMP requires an input file of the sampled individuals' microsatellite genotypes, along with information about several sampling and biological parameters, and provides an estimate of effective population size with 95% credible limits. We illustrate the use of ONeSAMP with an example data set from a reintroduced population of ibex, Capra ibex.
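A minimal sketch of the rejection-sampling flavour of approximate Bayesian computation underlying this kind of estimator: draw Ne from a prior, simulate a summary statistic, and keep draws whose simulated statistic falls close to the observed one. The simulator, summary statistic and tolerance below are illustrative placeholders, not ONeSAMP's actual algorithm.

```python
# ABC rejection sketch for effective population size (Ne); the simulator and
# summary statistic are toy stand-ins, not ONeSAMP's.
import numpy as np

rng = np.random.default_rng(0)

def simulate_heterozygosity(ne, n_loci=10):
    """Toy simulator: heterozygosity shrinks with drift, H = H0*(1 - 1/(2*Ne)),
    plus per-locus sampling noise averaged over loci."""
    h = 0.8 * (1.0 - 1.0 / (2.0 * ne))
    return h + rng.normal(0.0, 0.02, size=n_loci).mean()

observed = 0.78          # summary statistic computed from the genotype sample
accepted = []
for _ in range(100_000):
    ne = rng.uniform(10, 1000)                               # prior on Ne
    if abs(simulate_heterozygosity(ne) - observed) < 0.005:  # tolerance
        accepted.append(ne)

accepted = np.array(accepted)
lo, hi = np.percentile(accepted, [2.5, 97.5])
print(f"posterior median Ne = {np.median(accepted):.0f}, "
      f"95% credible interval ({lo:.0f}, {hi:.0f})")
```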
Abstract:
In survival analysis, frailty is often used to model heterogeneity between individuals or correlation within clusters. Typically, frailty is taken to be a continuous random effect, yielding a continuous mixture distribution for survival times. A Bayesian analysis of a correlated frailty model is discussed in the context of inverse Gaussian frailty. An MCMC approach is adopted, and the deviance information criterion is used to compare models. As an illustration of the approach, a bivariate data set of corneal graft survival times is analysed.
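To make the shared-frailty idea concrete, the sketch below simulates bivariate survival times in which each cluster (say, a patient's two corneal grafts) shares one inverse Gaussian frailty multiplying an exponential baseline hazard; this illustrates the model class only, not the paper's MCMC analysis.

```python
# Shared inverse Gaussian frailty: one frailty Z per cluster multiplies the
# baseline hazard of both members, inducing positive within-cluster correlation.
import numpy as np

rng = np.random.default_rng(1)
n_clusters, base_rate = 5000, 0.1           # exponential baseline hazard rate

lam = 2.0                                   # inverse Gaussian shape parameter
z = rng.wald(1.0, lam, size=n_clusters)     # frailties with mean 1

# Given Z, each survival time in the cluster is exponential with rate Z * base_rate.
t1 = rng.exponential(1.0 / (z * base_rate))
t2 = rng.exponential(1.0 / (z * base_rate))

print(f"within-cluster correlation: {np.corrcoef(t1, t2)[0, 1]:.2f}")
```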
Abstract:
Survival times for the Acacia mangium plantation in the Segaliud Lokan Project, Sabah, East Malaysia, were analysed based on 20 permanent sample plots (PSPs) established in 1988 as a spacing experiment. The PSPs were established following a complete randomized block design, with five levels of spacing randomly assigned to units within four blocks at different sites. The survival times of trees, in years, are of interest. Since the inventories were only conducted annually, the actual survival time of each tree was not observed; hence, the data set comprises censored survival times. Initial analysis of the survival of the Acacia mangium plantation suggested that there is a block-by-spacing interaction; that a Weibull model gives a reasonable fit to the replicate survival times within each PSP; but that a standard Weibull regression model is inappropriate because the shape parameter differs between PSPs. In this paper we investigate the form of the non-constant Weibull shape parameter. Parsimonious models for the Weibull survival times have been derived using maximum likelihood methods, with factor selection for the parameters based on a backward elimination procedure. The models are compared using likelihood ratio statistics. The results suggest that both Weibull parameters depend on spacing and block.
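Because the inventories are annual, each death is known only to an interval [t, t+1). A hedged sketch of the corresponding interval-censored Weibull likelihood, fitted by maximum likelihood on simulated (not the paper's) data:

```python
# Interval-censored Weibull ML: the likelihood contribution of a tree that
# died between annual inventories at t and t+1 is F(t+1) - F(t).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

t = weibull_min.rvs(1.8, scale=6.0, size=400, random_state=2)  # true times
left, right = np.floor(t), np.floor(t) + 1.0   # death known only to the year

def neg_log_lik(log_params):
    shape, scale = np.exp(log_params)          # optimize on the log scale
    p = (weibull_min.cdf(right, shape, scale=scale)
         - weibull_min.cdf(left, shape, scale=scale))
    return -np.log(np.clip(p, 1e-12, None)).sum()

fit = minimize(neg_log_lik, x0=np.log([1.0, 5.0]), method="Nelder-Mead")
shape_hat, scale_hat = np.exp(fit.x)
print(f"estimated shape = {shape_hat:.2f}, scale = {scale_hat:.2f}")
```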
Abstract:
Our conclusions are unaffected by removal of the time series identified by Peacock and Garshelis as harvest data. The relationship between a population's growth rate and its size is generally concave in mammals, irrespective of their body sizes. However, our data set includes quality data for only five mammals larger than 20 kilograms, so strong conclusions cannot be made about these animals.
Abstract:
Polarized epithelial cells are responsible for the vectorial transport of solutes and have a key role in maintaining body fluid and electrolyte homeostasis. Such cells contain structurally and functionally distinct plasma membrane domains. Brush border and basolateral membranes of renal and intestinal epithelial cells can be separated using a number of different separation techniques, which allow their different transport functions and receptor expression to be studied. In this communication, we report a proteomic analysis of these two membrane segments, apical and basolateral, obtained from the rat renal cortex isolated by two different methods: differential centrifugation and free-flow electrophoresis. The study was aimed at assessing the nature of the major proteins isolated by these two separation techniques. Two analytical strategies were used: separation by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) at the protein level, or by cation-exchange high-performance liquid chromatography (HPLC) after proteolysis (i.e., at the peptide level). Proteolytic peptides derived from the proteins present in gel pieces or from HPLC fractions after proteolysis were sequenced by on-line liquid chromatography-tandem mass spectrometry (LC-MS/MS). Several hundred proteins were identified in each membrane fraction. In addition to proteins known to be located at the apical and basolateral membranes, several novel proteins were also identified. In particular, a number of proteins with putative roles in signal transduction were identified in both membranes. To our knowledge, this is the first reported study to try to characterize the membrane proteome of polarized epithelial cells and to provide a data set of the most abundant proteins present in renal proximal tubule cell membranes.
Abstract:
We have developed a new method for the analysis of voids in proteins (defined as empty cavities not accessible to solvent). This method combines analysis of individual discrete voids with analysis of packing quality. While these are different aspects of the same effect, they have traditionally been analysed using different approaches. The method has been applied to the calculation of total void volume and maximum void size in a non-redundant set of protein domains, and has been used to examine correlations between thermal stability and void size. The tumour-suppressor protein p53 was then compared with the non-redundant data set to determine whether its low thermal stability results from poor packing. We found that p53 has average packing, but that the detrimental effects of some previously unexplained p53 mutations observed in cancer can be explained by the creation of unusually large voids.
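The notion of an internal void can be illustrated with a crude voxel version of the computation: mark grid cells occupied by atoms, then find empty connected regions that do not reach the grid boundary. Real void analysis uses much more careful geometry; this is only a toy sketch with a synthetic hollow "protein".

```python
# Toy voxel void finder: empty connected regions not touching the grid
# boundary count as internal cavities.
import numpy as np
from scipy.ndimage import label

shape = (40, 40, 40)
coords = np.indices(shape).reshape(3, -1).T
dist = np.linalg.norm(coords - np.array([20, 20, 20]), axis=1).reshape(shape)

occupied = (dist > 8) & (dist < 12)     # a hollow spherical "protein" shell
labels, n = label(~occupied)            # connected empty regions

# Labels present on any face of the grid are solvent-accessible, not voids.
faces = [labels[0], labels[-1], labels[:, 0], labels[:, -1],
         labels[:, :, 0], labels[:, :, -1]]
exterior = set(np.unique(np.concatenate([f.ravel() for f in faces])))

void_cells = sum((labels == k).sum() for k in range(1, n + 1)
                 if k not in exterior)
print(f"internal void volume: {void_cells} grid cells")
```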
Abstract:
Objectives: To assess the potential source of variation that the surgeon may add to patient outcome in a clinical trial of surgical procedures. Methods: Two large (n = 1380) parallel multicentre randomized surgical trials, involving 43 surgeons, were undertaken to compare laparoscopically assisted hysterectomy with conventional methods of abdominal and vaginal hysterectomy. The primary end point of the trial was the occurrence of at least one major complication. Patients were nested within surgeons, giving the data set a hierarchical structure. A total of 10% of patients had at least one major complication, that is, a sparse binary outcome variable. A mixed logistic regression model (with logit link function) was used to model the probability of a major complication, with surgeon fitted as a random effect. Models were fitted by maximum likelihood in SAS®. Results: There were many convergence problems. These were resolved using a variety of approaches, including treating all effects as fixed during the initial model building, modelling the variance of a parameter on a logarithmic scale, and centring continuous covariates. The initial model building indicated no significant 'type of operation' by surgeon interaction effect in either trial; the 'type of operation' term was highly significant in the abdominal trial, and the 'surgeon' term was not significant in either trial. Conclusions: The analysis did not find a surgeon effect, but it is difficult to conclude that there was no difference between surgeons. The statistical test may have lacked sufficient power: the variance estimates were small with large standard errors, indicating that their precision may be questionable.
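A sketch of the hierarchical model structure on simulated data, with surgeon fitted as a random effect. It uses statsmodels' variational Bayes mixed GLM rather than the SAS maximum likelihood fit described in the paper, and all numbers below are placeholders.

```python
# Mixed logistic regression with a surgeon random intercept, on simulated data.
import numpy as np
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

rng = np.random.default_rng(5)
n_surgeons, per_surgeon = 43, 32
surgeon = np.repeat(np.arange(n_surgeons), per_surgeon)
u = rng.normal(0.0, 0.3, n_surgeons)        # surgeon random effects
op = rng.integers(0, 2, surgeon.size)       # 0 = conventional, 1 = laparoscopic
logit = -2.2 + 0.5 * op + u[surgeon]        # roughly a 10% complication rate
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

df = pd.DataFrame({"y": y, "op": op, "surgeon": surgeon})
model = BinomialBayesMixedGLM.from_formula(
    "y ~ op", {"surgeon": "0 + C(surgeon)"}, df)
print(model.fit_vb().summary())             # variational Bayes fit
```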
Abstract:
A Bayesian approach to analysing data from family-based association studies is developed. This permits direct assessment of the range of possible values of model parameters, such as the recombination frequency and allelic associations, in the light of the data. In addition, sophisticated comparisons of different models may be handled easily, even when such models are not nested. The methodology is developed in such a way as to allow separate inferences to be made about linkage and association by including theta, the recombination fraction between the marker and disease susceptibility locus under study, explicitly in the model. The method is illustrated by application to a previously published data set. The data analysis raises some interesting issues, notably with regard to the weight of evidence necessary to convince us of linkage between a candidate locus and disease.
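The core of making theta explicit can be shown with a toy grid posterior: with a binomial likelihood for hypothetical recombinant counts and a flat prior on theta over [0, 0.5], the posterior directly displays the range of plausible recombination fractions. The paper's model is far richer; this only illustrates the Bayesian treatment of theta.

```python
# Grid posterior for the recombination fraction theta under a flat prior.
import numpy as np

theta = np.linspace(1e-4, 0.5, 500)     # theta < 0.5 corresponds to linkage
r, n = 3, 40                            # hypothetical recombinant counts

log_lik = r * np.log(theta) + (n - r) * np.log(1.0 - theta)   # binomial
post = np.exp(log_lik - log_lik.max())
post /= post.sum()

cdf = post.cumsum()
lo, hi = theta[cdf.searchsorted(0.025)], theta[cdf.searchsorted(0.975)]
print(f"posterior mean theta = {(theta * post).sum():.3f}, "
      f"95% interval ({lo:.3f}, {hi:.3f})")
```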
Abstract:
Thirty-one new sodium heterosulfamates, RNHSO3Na, where the R portion contains mainly thiazole, benzothiazole, thiadiazole and pyridine ring structures, have been synthesized and their taste portfolios assessed. A database of 132 heterosulfamates (both open-chain and cyclic) has been formed by combining these new compounds with an existing set of 101 previously synthesized heterosulfamates for which taste data are available. Simple descriptors have been obtained using (i) measurements with Corey-Pauling-Koltun (CPK) space-filling models, giving x, y and z dimensions and a volume V_CPK, (ii) calculated first-order valence molecular connectivities (¹χᵛ), and (iii) parameters calculated with the Spartan program: the HOMO and LUMO energies, the solvation energy E_solv, and V_SPARTAN. The techniques of linear (LDA) and quadratic (QDA) discriminant analysis and Tree analysis have then been employed to develop structure-taste relationships (SARs) that classify the sweet (S) and non-sweet (N) compounds into separate categories. In the LDA analysis, 70% of the compounds were correctly classified (compared with 65% when the smaller data set of 101 compounds was used), and in the QDA analysis 68% were correctly classified (compared to 80% previously). The Tree analysis correctly classified 81% (compared to 86% previously). An alternative Tree analysis, derived using the Cerius2 program and a set of physicochemical descriptors, correctly classified only 54% of the compounds.
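Schematically, the discriminant-analysis step amounts to fitting LDA and QDA classifiers on a compounds-by-descriptors matrix and reporting classification rates. The sketch below uses random placeholder descriptor values of the kinds named above, not the paper's data.

```python
# LDA/QDA structure-taste classification on placeholder descriptor data.
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n = 132
X = np.column_stack([
    rng.normal(10, 2, n), rng.normal(8, 2, n), rng.normal(6, 1, n),  # x, y, z
    rng.normal(300, 50, n),   # V_CPK volume
    rng.normal(-9, 1, n),     # HOMO energy
    rng.normal(1.5, 0.4, n),  # first-order molecular connectivity
])
y = rng.integers(0, 2, n)     # 1 = sweet, 0 = non-sweet (placeholder labels)

for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("QDA", QuadraticDiscriminantAnalysis())]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {acc:.0%} correctly classified (placeholder data)")
```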
Abstract:
This paper introduces the findings of a recent study on the use of information technology (IT) among quantity surveying (QS) organisations in Hong Kong. The study was conducted through a structured questionnaire survey of 18 QS organisations registered in Hong Kong, representing around 53% of the total number of organisations in the profession. The data set generated from this study provided rich information about what information technology the QS profession used and what benefits and barriers users in the industry perceived. The survey concluded that although IT was widely used in QS organisations in Hong Kong, it was mainly used to support various individual tasks of QS services at a basic level, rather than to streamline the production of QS services as a whole through automation. Most of the respondents agreed that IT plays an important role in the QS profession, but they had not fully taken advantage of IT to improve their competitive edge in the market; they usually adopted a more passive "wait and see" approach. In addition, very few QS organisations in Hong Kong have a comprehensive policy for promoting the use of IT within the organisation. It is recommended that the QS profession recognise the importance of IT and take appropriate action to meet the challenges of an ever-changing and competitive marketplace.
Abstract:
Commonly used repair rate models for repairable systems in the reliability literature are renewal processes, generalised renewal processes or non-homogeneous Poisson processes. In addition to these models, geometric processes (GPs) are studied occasionally. The GP, however, can only model systems with monotonically changing (increasing, decreasing or constant) failure intensities. This paper deals with the reliability modelling of failure processes for repairable systems where the failure intensity shows a bathtub-type non-monotonic behaviour. A new stochastic process, an extended Poisson process, is introduced in this paper. Reliability indices are presented, and the parameters of the new process are estimated. Experimental results on a data set demonstrate the validity of the new process.
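The GP's monotonicity restriction is easy to see in simulation: successive inter-failure times X_k = Y_k / a^(k-1) can only shrink (a > 1, deteriorating system), grow (a < 1, improving system) or stay constant (a = 1). The sketch below shows this; it does not reproduce the paper's extended Poisson process.

```python
# Geometric process simulation: inter-failure times scaled by a**(k-1).
import numpy as np

rng = np.random.default_rng(7)

def geometric_process(a, n_failures, mean_first=100.0):
    y = rng.exponential(mean_first, n_failures)   # i.i.d. underlying times
    return y / a ** np.arange(n_failures)         # X_k = Y_k / a**(k-1)

for a in (0.95, 1.0, 1.05):
    x = geometric_process(a, 200)
    print(f"a = {a}: mean gap, first 50 = {x[:50].mean():7.1f}, "
          f"last 50 = {x[-50:].mean():7.1f}")
```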
Abstract:
The paper introduces an efficient construction algorithm for obtaining sparse linear-in-the-weights regression models based on an approach of directly optimizing model generalization capability. This is achieved by utilizing the delete-1 cross-validation concept and the associated leave-one-out test error, also known as the predicted residual sums of squares (PRESS) statistic, without resorting to any other validation data set for model evaluation in the model construction process. Computational efficiency is ensured by using orthogonal forward regression, but the algorithm incrementally minimizes the PRESS statistic instead of the usual sum of squared training errors. A local regularization method can naturally be incorporated into the model selection procedure to further enforce model sparsity. The proposed algorithm is fully automatic, and the user is not required to specify any criterion to terminate the model construction procedure. Comparisons with some existing state-of-the-art modeling methods are given, and several examples are included to demonstrate the ability of the proposed algorithm to effectively construct sparse models that generalize well.
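The key identity exploited by PRESS-based selection is that the leave-one-out residuals of a linear-in-the-weights model are available in closed form as e_i / (1 - h_ii), where h_ii are the hat-matrix diagonals, so no data are actually held out. A small demonstration follows; the paper's incremental orthogonal forward regression is not reproduced here.

```python
# PRESS via the hat matrix: exact LOO residuals without refitting n times.
import numpy as np

rng = np.random.default_rng(8)
n, p = 100, 5
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(0.0, 0.5, n)

H = X @ np.linalg.solve(X.T @ X, X.T)     # hat (projection) matrix
resid = y - H @ y
loo_resid = resid / (1.0 - np.diag(H))    # exact leave-one-out residuals
print(f"PRESS = {np.sum(loo_resid**2):.2f}  vs  RSS = {np.sum(resid**2):.2f}")
```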
Abstract:
This paper introduces a new neurofuzzy model construction and parameter estimation algorithm from observed finite data sets, based on a Takagi-Sugeno (T-S) inference mechanism and a new extended Gram-Schmidt orthogonal decomposition algorithm, for modeling a priori unknown dynamical systems in the form of a set of fuzzy rules. The first contribution of the paper is the introduction of a one-to-one mapping between a fuzzy rule base and a model matrix feature subspace using the T-S inference mechanism. This link enables the numerical properties associated with a rule-based matrix subspace, the relationships amongst these matrix subspaces, and the correlation between the output vector and a rule-based matrix subspace to be investigated and extracted as rule-based knowledge to enhance model transparency. The matrix subspace spanned by a fuzzy rule is initially derived as the input regression matrix multiplied by a weighting matrix that consists of the corresponding fuzzy membership functions over the training data set. Model transparency is explored by deriving an equivalence between an A-optimality experimental design criterion of the weighting matrix and the average model output sensitivity to the fuzzy rule, so that rule bases can be effectively measured by their identifiability via the A-optimality experimental design criterion. The A-optimality experimental design criterion of the weighting matrices of fuzzy rules is used to construct an initial model rule base. An extended Gram-Schmidt algorithm is then developed to estimate the parameter vector for each rule. This new algorithm decomposes the model rule bases via an orthogonal subspace decomposition approach, so as to enhance model transparency with the capability of interpreting the derived rule-base energy level. This new approach is computationally simpler than the conventional Gram-Schmidt algorithm for resolving high-dimensional regression problems, where it is computationally desirable to decompose complex models into a few submodels rather than a single model with a large number of input variables and the associated curse of dimensionality. Numerical examples are included to demonstrate the effectiveness of the proposed new algorithm.
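The rule-weighting construction can be sketched as follows: each fuzzy rule weights the rows of the regression matrix by its membership values over the training data, and an A-optimality-style score (trace of the inverse moment matrix of the weighted regressors) gauges how well that rule's parameters are identified. All details below are illustrative, not the paper's algorithm.

```python
# A-optimality-style identifiability score for Takagi-Sugeno fuzzy rules.
import numpy as np

rng = np.random.default_rng(9)
n = 200
x = rng.uniform(-3, 3, n)
X = np.column_stack([np.ones(n), x])            # local linear model per rule

for centre in (-2.0, 0.0, 2.0):
    mu = np.exp(-0.5 * (x - centre) ** 2)       # Gaussian rule membership
    Xw = X * mu[:, None]                        # membership-weighted regressors
    score = np.trace(np.linalg.inv(Xw.T @ Xw))  # smaller = better identified
    print(f"rule centred at {centre:+.0f}: A-optimality score {score:.4f}")
```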
Abstract:
A fundamental principle in practical nonlinear data modeling is the parsimonious principle of constructing the minimal model that explains the training data well. Leave-one-out (LOO) cross validation is often used to estimate generalization errors when choosing amongst different network architectures (M. Stone, "Cross-validatory choice and assessment of statistical predictions", J. R. Statist. Soc., Ser. B, 36, pp. 111-147, 1974). Based upon the minimization of a LOO criterion, either the mean square of the LOO errors for regression or the LOO misclassification rate for classification, we present two backward elimination algorithms as model post-processing procedures. The proposed backward elimination procedures exploit an orthogonalization procedure to ensure orthogonality between the subspace spanned by the pruned model and the deleted regressor. Subsequently, it is shown that the LOO criteria used in both algorithms can be calculated via analytic recursive formulae, as derived in this contribution, without actually splitting the estimation data set, thereby reducing computational expense. Compared to most other model construction methods, the proposed algorithms are advantageous in several respects: (i) there are no tuning parameters to be optimized through an extra validation data set; (ii) the procedure is fully automatic, without an additional stopping criterion; and (iii) model structure selection is based directly on model generalization performance. Illustrative examples on regression and classification demonstrate that the proposed algorithms are viable post-processing methods for pruning a model to gain extra sparsity and improved generalization.
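A simplified sketch of the backward elimination logic for the regression case: at each step, drop the regressor whose removal most reduces PRESS, computed in closed form from the hat matrix rather than by refitting n times. The paper's recursive orthogonal formulation is more efficient; this direct version only illustrates the selection rule.

```python
# Backward elimination driven by the closed-form leave-one-out PRESS criterion.
import numpy as np

rng = np.random.default_rng(10)
n = 120
X = rng.normal(size=(n, 8))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.5, n)   # two true regressors

def press(Xs):
    H = Xs @ np.linalg.solve(Xs.T @ Xs, Xs.T)
    loo = (y - H @ y) / (1.0 - np.diag(H))      # exact LOO residuals
    return np.sum(loo ** 2)

cols = list(range(X.shape[1]))
while len(cols) > 1:
    best_press, drop = min(
        (press(X[:, [c for c in cols if c != d]]), d) for d in cols)
    if best_press >= press(X[:, cols]):         # pruning no longer helps
        break
    cols.remove(drop)
print(f"retained regressors: {sorted(cols)}, PRESS = {press(X[:, cols]):.2f}")
```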