910 results for Ordered probit regression
Abstract:
1. Fitting a linear regression to data provides much more information about the relationship between two variables than a simple correlation test. A goodness of fit test of the line should always be carried out. Hence, r squared estimates the strength of the relationship between Y and X, ANOVA tests whether a statistically significant line is present, and the ‘t’ test determines whether the slope of the line is significantly different from zero. 2. Always check whether the data collected fit the assumptions for regression analysis and, if not, whether a transformation of the Y and/or X variables is necessary. 3. If the regression line is to be used for prediction, it is important to determine whether the prediction involves an individual y value or a mean. Care should be taken if predictions are made close to the extremities of the data, and predictions are subject to considerable error if x falls beyond the range of the data. Multiple predictions require correction of the P values. 4. If several individual regression lines have been calculated from a number of similar sets of data, consider whether they should be combined to form a single regression line. 5. If the data exhibit a degree of curvature, then fitting a higher-order polynomial curve may provide a better fit than a straight line. In this case, a test of whether the data depart significantly from a linear regression should be carried out.
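The three quantities named in point 1 can be sketched in a few lines. This is a minimal illustration using textbook least-squares formulas, not code from the abstracted paper; the function name `simple_linear_regression` and the returned dictionary keys are assumptions made for this example.

```python
import math

def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by least squares and return fit statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Corrected sums of squares and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    ss_regression = slope * sxy          # variation explained by the line
    ss_residual = syy - ss_regression    # variation left about the line
    r_squared = ss_regression / syy      # strength of the relationship

    residual_variance = ss_residual / (n - 2)      # n - 2 degrees of freedom
    f_anova = ss_regression / residual_variance    # ANOVA F ratio (1, n - 2 df)
    t_slope = slope / math.sqrt(residual_variance / sxx)  # t test of slope = 0

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "f_anova": f_anova,
        "t_slope": t_slope,
    }
```

For a single X variable the ANOVA F ratio equals the square of the slope's t statistic, so the two tests agree; the resulting statistic still has to be compared with the appropriate F or t distribution to obtain a P value.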
Abstract:
Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures, since anyone with a database and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure, together with knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study. This advice should be taken only as a rough guide, but it does indicate that the variables included should be selected with great care, as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.
Abstract:
1. The techniques associated with regression, whether linear or non-linear, are some of the most useful statistical procedures that can be applied in clinical studies in optometry. 2. In some cases, there may be no scientific model of the relationship between X and Y that can be specified in advance and the objective may be to provide a ‘curve of best fit’ for predictive purposes. In such cases, the fitting of a general polynomial type curve may be the best approach. 3. An investigator may have a specific model in mind that relates Y to X and the data may provide a test of this hypothesis. Some of these curves can be reduced to a linear regression by transformation, e.g., the exponential and negative exponential decay curves. 4. In some circumstances, e.g., the asymptotic curve or logistic growth law, a more complex process of curve fitting involving non-linear estimation will be required.
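The reduction mentioned in point 3 can be illustrated for the exponential curve y = a·exp(b·x): taking logarithms gives ln y = ln a + b·x, which is a straight line in x. The sketch below assumes all y values are positive; the function name `fit_exponential` is hypothetical and is not taken from the abstracted paper.

```python
import math

def fit_exponential(x, y):
    """Estimate a and b in y = a * exp(b * x) by regressing ln(y) on x.

    Assumes every y value is strictly positive so the logarithm exists.
    """
    log_y = [math.log(v) for v in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_log_y = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_log_y) for xi, li in zip(x, log_y))
    b = sxy / sxx                      # slope of the log-linear line
    ln_a = mean_log_y - b * mean_x     # intercept of the log-linear line
    return math.exp(ln_a), b
```

Note that least squares on the log scale minimises relative rather than absolute errors in y, so the estimates can differ from those of a direct non-linear fit when the data are noisy.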
Abstract:
Regression problems are concerned with predicting the values of one or more continuous quantities, given the values of a number of input variables. For virtually every application of regression, however, it is also important to have an indication of the uncertainty in the predictions. Such uncertainties are expressed in terms of the error bars, which specify the standard deviation of the distribution of predictions about the mean. Accurate estimation of error bars is of practical importance, especially when safety and reliability are at issue. The Bayesian view of regression leads naturally to two contributions to the error bars. The first arises from the intrinsic noise on the target data, while the second comes from the uncertainty in the values of the model parameters, which manifests itself in the finite width of the posterior distribution over the space of these parameters. The Hessian matrix, which involves the second derivatives of the error function with respect to the weights, is needed for implementing the Bayesian formalism in general and estimating the error bars in particular. A study of different methods for evaluating this matrix is given, with special emphasis on the outer product approximation method. The contribution of the uncertainty in model parameters to the error bars is a finite data size effect, which becomes negligible as the number of data points in the training set increases. A study of this contribution is given in relation to the distribution of data in input space. It is shown that the addition of data points to the training set can only reduce the local magnitude of the error bars or leave it unchanged. Using the asymptotic limit of an infinite data set, it is shown that the error bars have an approximate relation to the density of data in input space.
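The two contributions to the error bars can be made concrete with the simplest possible case. The sketch below assumes a one-parameter linear model y = w·x with Gaussian noise of precision `beta` and a Gaussian prior on w of precision `alpha`; these names and the function `predictive_variance` are assumptions for illustration, not the formulation used in the abstracted work. For this model the Hessian is a single number and its outer-product form is exact.

```python
def predictive_variance(x_train, x_new, alpha=1.0, beta=25.0):
    """Return (noise_term, parameter_term) of the predictive variance at x_new.

    Model (assumed for this sketch): y = w * x + noise, noise variance 1 / beta,
    Gaussian prior on w with precision alpha.
    """
    # Hessian of the regularised error with respect to w. For this model the
    # gradient of the output with respect to w is x, so the outer-product sum
    # beta * sum(x_i * x_i) is exact rather than an approximation.
    hessian = alpha + beta * sum(xi * xi for xi in x_train)

    noise_term = 1.0 / beta                   # intrinsic noise on the targets
    parameter_term = x_new * x_new / hessian  # posterior uncertainty in w
    return noise_term, parameter_term
```

Each added training point can only increase `hessian`, so `parameter_term` can only shrink or stay the same, while `noise_term` is unaffected by the amount of data. The error bar is the square root of the sum of the two terms.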
Abstract:
A novel direct compression tableting excipient has been made by recrystallisation of lactose. The particles produced had high porosity, high specific surface area and high surface roughness. The resistance to segregation of ordered mixes formed between a model drug, potassium chloride, and the excipients recrystallised lactose, spray crystallised maltose-dextrose (Emdex) and a direct compacting sugar (Dipac) was studied using a vibrational segregation model. The highly porous excipients, Emdex and recrystallised lactose, formed ordered mixes which did not segregate even at high accelerations and low frequencies, whereas the relatively smooth excipient, Dipac, displayed marked segregation in most vibration conditions. The vibrations were related to practical conditions measured in pharmaceutical process machinery. The time required to form an ordered mix was inversely related to the stability of the mix when subjected to vibration. An ultracentrifuge technique was developed to determine the interparticle adhesion forces holding drug and excipient particles together as ordered units. Excipient powders such as Emdex and recrystallised lactose, which formed non-segregating ordered mixes, had high interparticle adhesion forces. Other ordered mixes that segregated when subjected to different vibration conditions were found to have large quantities of weakly-bound drug particles; such mixes included those with Dipac as the carrier excipient as well as those containing a high concentration of drug. The electrostatic properties of different drug and excipient powders were studied using a Faraday well and an electrometer. Excipient powders such as Emdex and recrystallised lactose, which formed stable ordered mixes, also had a widely different surface charge in comparison with drug particles, whereas Dipac had a similar surface charge to the drug particles and formed unstable ordered mixes.
A specially constructed triboelectric charging apparatus based on an air cyclone was developed to increase the affinity of drug particles for different excipient particles. Using triboelectrification to increase the interparticle adhesion forces, the segregation tendencies of unstable ordered mixes were greatly reduced. The stability of ordered mixes is shown to be related to both the surface physical characteristics and the surface electrical properties of the constituent carrier (excipient) particles.
Abstract:
An investigator may also wish to select a small subset of the X variables that gives the best prediction of the Y variable. In this case, the question is how many variables the regression equation should include. One method would be to calculate the regression of Y on every subset of the X variables and choose the subset that gives the smallest mean square deviation from the regression. Most investigators, however, prefer to use a ‘stepwise multiple regression’ procedure. There are two forms of this analysis, called the ‘step-up’ (or ‘forward’) method and the ‘step-down’ (or ‘backward’) method. This Statnote illustrates the use of stepwise multiple regression with reference to the scenario introduced in Statnote 24, viz., the influence of climatic variables on the growth of the crustose lichen Rhizocarpon geographicum (L.) DC.
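The ‘step-up’ (forward) idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the Statnote: variables are added one at a time according to the largest drop in residual sum of squares, and the stopping rule is a hypothetical minimum gain in R squared (`min_r2_gain`) rather than the F-to-enter test that statistical packages normally apply. All function names are made up for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def residual_ss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations (X'X) coef = X'y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(design[r][i] * y[r] for r in range(n)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def forward_select(predictors, y, min_r2_gain=0.01):
    """Forward selection over a dict of name -> list of values."""
    total_ss = residual_ss([], y)   # sum of squares about the mean of y
    if total_ss == 0.0:
        return []
    selected, current = [], total_ss
    remaining = set(predictors)
    while remaining:
        # Try each unused variable and keep the one with the smallest RSS.
        trials = {name: residual_ss([predictors[s] for s in selected + [name]], y)
                  for name in remaining}
        best = min(sorted(trials), key=lambda name: trials[name])
        if (current - trials[best]) / total_ss < min_r2_gain:
            break                   # no candidate adds enough to R squared
        selected.append(best)
        remaining.remove(best)
        current = trials[best]
    return selected
```

The ‘step-down’ (backward) method runs the same logic in reverse: start with all variables in the equation and repeatedly drop the one whose removal costs the least.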
Abstract:
The aim of this research work was primarily to examine the relevance of patient parameters, ward structures, procedures and practices, in respect of the potential hazards of wound cross-infection and nasal colonisation with multiple resistant strains of Staphylococcus aureus, which it is thought might provide a useful indication of a patient's general susceptibility to wound infection. Information from a large cross-sectional survey involving 12,000 patients from some 41 hospitals and 375 wards was collected over a five-year period from 1967-72, and its validity checked before any subsequent analysis was carried out. Many environmental factors and procedures which had previously been thought (but never conclusively proved) to have an influence on wound infection or nasal colonisation rates, were assessed, and subsequently dismissed as not being significant, provided that the standard of the current range of practices and procedures is maintained and not allowed to deteriorate. Retrospective analysis revealed that the probability of wound infection was influenced by the patient's age, duration of pre-operative hospitalisation, sex, type of wound, presence and type of drain, number of patients in ward, and other special risk factors, whilst nasal colonisation was found to be influenced by the patient's age, total duration of hospitalisation, sex, antibiotics, proportion of occupied beds in the ward, average distance between bed centres and special risk factors. A multi-variate regression analysis technique was used to develop statistical models, consisting of variable patient and environmental factors which were found to have a significant influence on the risks pertaining to wound infection and nasal colonisation. 
A relationship between wound infection and nasal colonisation was then established and this led to the development of a more advanced model for predicting wound infections, taking advantage of the additional knowledge of the patient's state of nasal colonisation prior to operation.
Abstract:
This paper explores the use of the optimization procedures in SAS/OR software with application to the ordered weighted averaging (OWA) operators of decision-making units (DMUs). OWA, originally introduced by Yager (IEEE Trans Syst Man Cybern 18(1):183-190, 1988), has gained much interest among researchers; hence many applications have been proposed in areas such as decision making, expert systems, data mining, approximate reasoning, and fuzzy systems and control. SAS, for its part, is powerful software capable of running various optimization tools such as linear and non-linear programming with all types of constraints. To facilitate the use of the OWA operator by SAS users, a code was implemented. The SAS macro developed in this paper selects the criteria and alternatives from a SAS dataset and calculates a set of OWA weights. An example is given to illustrate the features of SAS/OWA software. © Springer-Verlag 2009.
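For readers unfamiliar with the operator itself, the sketch below shows how an OWA aggregation is applied once a weight vector is available. It is written in Python rather than SAS and does not reproduce the paper's macro or its optimization step for deriving the weights; the function name `owa` is an assumption for this example.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to rank positions, not criteria.

    The values are sorted in descending order, then the i-th weight multiplies
    the i-th largest value. Weights must be non-negative and sum to 1.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))
```

With weights (1, 0, 0) the operator returns the maximum score, with (0, 0, 1) the minimum, and with equal weights the arithmetic mean, which is why the choice of weight vector encodes the decision maker's degree of optimism.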
Abstract:
In previous Statnotes, the application of correlation and regression methods to the analysis of two variables (X,Y) was described. These methods can be used to determine whether there is a linear relationship between the two variables, whether the relationship is positive or negative, to test the degree of significance of the linear relationship, and to obtain an equation relating Y to X. This Statnote extends the methods of linear correlation and regression to situations where there are two or more X variables, i.e., ‘multiple linear regression’.
Abstract:
The practice of evidence-based medicine involves consulting documents from repositories such as Scopus, PubMed, or the Cochrane Library. The most common approach for presenting retrieved documents is in the form of a list, with the assumption that the higher a document is on a list, the more relevant it is. Despite this list-based presentation, it is seldom studied how physicians perceive the importance of the order of documents presented in a list. This paper describes an empirical study that elicited and modeled physicians' preferences with regard to list-based results. Preferences were analyzed using a GRIP method that relies on pairwise comparisons of selected subsets of possible rank-ordered lists composed of 3 documents. The results allow us to draw conclusions regarding physicians' attitudes towards the importance of having documents ranked correctly on a result list, versus the importance of retrieving relevant but misplaced documents. Our findings should help developers of clinical information retrieval applications when deciding how retrieved documents should be presented and how performance of the application should be assessed. © 2012 Springer-Verlag Berlin Heidelberg.
Abstract:
Direct quantile regression involves estimating a given quantile of a response variable as a function of input variables. We present a new framework for direct quantile regression where a Gaussian process model is learned, minimising the expected tilted loss function. The integration required in learning is not analytically tractable so to speed up the learning we employ the Expectation Propagation algorithm. We describe how this work relates to other quantile regression methods and apply the method on both synthetic and real data sets. The method is shown to be competitive with state of the art methods whilst allowing for the leverage of the full Gaussian process probabilistic framework.
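The tilted loss at the heart of this framework is easy to state on its own. The sketch below shows only the loss and a brute-force search over observed values; it does not implement the Gaussian process model or Expectation Propagation described in the abstract, and the names `tilted_loss` and `empirical_quantile` are assumptions for this example.

```python
def tilted_loss(y, q, tau):
    """Tilted (pinball) loss of predicting quantile value q for outcome y."""
    diff = y - q
    # Under-prediction costs tau per unit, over-prediction costs (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def empirical_quantile(data, tau):
    """Return the observed value that minimises the mean tilted loss."""
    def mean_loss(q):
        return sum(tilted_loss(y, q, tau) for y in data) / len(data)
    return min(data, key=mean_loss)
```

Minimising the expected tilted loss yields the tau-th quantile, so tau = 0.5 recovers the median; a quantile regression model replaces the constant q with a function of the input variables.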
Abstract:
The component spectra of a mixture of isomers with nearly identical diffusion coefficients cannot normally be distinguished in a standard diffusion-ordered spectroscopy (DOSY) experiment but can often be easily resolved using matrix-assisted DOSY, in which diffusion behaviour is manipulated by the addition of a co-solute such as a surfactant. Relatively little is currently known about the conditions required for such a separation, for example, how the choice between normal and reverse micelles affects separation or how the isomer structures themselves affect the resolution. The aim of this study was to explore the application of sodium dodecyl sulfate (SDS) normal micelles in aqueous solution and sodium 1,4-bis(2-ethylhexyl)sulfosuccinate (AOT) aggregates in chloroform, at a range of concentrations, to the diffusion resolution of some simple model sets of isomers such as monomethoxyphenols and short chain alcohols. It is shown that SDS micelles offer better resolution where these isomers differ in the position of a hydroxyl group, whereas AOT aggregates are more effective for isomers differing in the position of a methyl group. For both the normal SDS micelles and the less well-defined AOT aggregates, differences in the resolution of the isomers can in part be rationalised in terms of differing degrees of hydrophobicity, amphiphilicity and steric effects. Copyright © 2012 John Wiley & Sons, Ltd.
Abstract:
Diffusion-ordered spectroscopy (DOSY) is a powerful technique for mixture analysis, but in its basic form it cannot separate the component spectra for species with very similar diffusion coefficients. It has been recently demonstrated that the component spectra of a mixture of isomers with nearly identical diffusion coefficients (the three dihydroxybenzenes) can be resolved using matrix-assisted DOSY (MAD), in which diffusion is perturbed by the addition of a co-solute such as a surfactant [R. Evans, S. Haiber, M. Nilsson, G. A. Morris, Anal. Chem. 2009, 81, 4548-4550]. However, little is known about the conditions required for such a separation, for example, the concentrations and concentration ratios of surfactant and solutes. The aim of this study was to explore the concentration range over which matrix-assisted DOSY using the surfactant SDS can achieve diffusion resolution of a simple model set of isomers, the monomethoxyphenols. The results show that the separation is remarkably robust with respect to both the concentrations and the concentration ratios of surfactant and solutes, supporting the idea that MAD may become a valuable tool for mixture analysis. © 2010 John Wiley & Sons, Ltd.
Abstract:
Diffusion-ordered NMR spectroscopy ("DOSY") is a useful tool for the identification of mixture components. In its basic form it relies on simple differences in hydrodynamic radius to distinguish between different species. This can be very effective where species have significantly different molecular sizes, but generally fails for isomeric species. The use of surfactant co-solutes can allow isomeric species to be distinguished by virtue of their different degrees of interaction with micelles or reversed micelles. The use of micelle-assisted DOSY to resolve the NMR spectra of isomers is illustrated for the case of the three dihydroxybenzenes (catechol, resorcinol, and hydroquinone) in aqueous solution containing sodium dodecyl sulfate micelles, and in chloroform solution containing AOT reversed micelles. © 2009 American Chemical Society.